(57) Abstract: The present invention relates to a system that allows the specific regulation of a gene in a eukaryotic cell. The system
comprises a novel repressor gene construct (shown in Fig. 1), wherein the construct comprises a promoter (stippled box) operably
linked to a rabbit β-globin intron2/exon3 sequence (open box) which is operably linked to a modified lac repressor coding region
which is operably linked to rabbit β-globin 3' untranslated sequences (solid bar). The modified lac repressor coding region is made
up of segments that are identical to the wild type bacterial sequence (crosshatched box), and segments that have been reencoded to
use mammalian codons (striped box).

A Lac Operator-Repressor System

US Government Rights 

This invention was made with United States Government support
under Grant No. NCRR 11102 and MH 12406, awarded by National Institutes of
Health. The United States Government has certain rights in the invention.

Related Application 

This application claims priority under 35 USC § 119(e) to US
Provisional Application Serial Nos. 60/273,480, filed March 5, 2001, and 60/281,322,
filed April 4, 2001, the disclosures of which are incorporated herein.

Field of the Invention 

The present invention is directed to a system for regulating the
expression of a gene in an animal. The system comprises a novel gene construct that
encodes a repressor protein, wherein the repressor functions to bind to the operator
region of a recombinant gene and inhibit transcription of the recombinant gene in an
animal. Expression of the gene is increased upon the administration of an exogenous
inducer agent to the animal, wherein the inducer agent causes the removal of the
repressor from the operator.

Background of the Invention 

Jacob and Monod described the system of structural and regulatory
elements that make up the lac operon of E. coli. This set of genes is coordinately
regulated by lactose, a metabolite used by bacteria as an energy source when it is
present in their environment. The regulatory components of the system are the lac
repressor and its DNA binding sequence, the lac operator. These two elements
control the transcription of the rest of the genes in the lac operon that encode enzymes
necessary for lactose metabolism. In the absence of lactose, the lac repressor occupies
the lac operators, altering the structure of the promoter in the region of the RNA
polymerase binding site, and preventing transcription. Lactose causes a
conformational change in the repressor, which then vacates the operators, allowing RNA
polymerase to gain access to the promoter and initiate transcription.

Hu and Davidson (Hu and Davidson, Cell 48(4), 555-566, 1987) were
the first to use lac elements to control reporter gene expression and activity in
mammalian cells. They modified the bacterial GTG initiator codon of lacI to ATG
and used the Rous sarcoma virus LTR to drive lacI expression. They showed that
mouse L-cells stably transfected with this lacI expression vector produced sufficient
lac repressor protein to control the expression and activity of an MSV-CAT reporter
gene with lac operators inserted into the promoter. The lactose analog
isopropyl-β-D-thiogalactoside (IPTG) caused a marked de-repression of CAT activity in mouse
L-cells, demonstrating that the system was also reversible. This result was extended by
Figge et al. (Figge et al., Cell 52(5), 713-722, 1988) to stably integrated regulatable
reporter genes in monkey cell lines.

It has long been recognized that a system that would allow similar
control over the activity of genes in animals, such as the mouse, would be extremely
useful. In 1997 it was reported that transgenes containing the bacterial coding
sequence for the lac repressor downstream of the β-actin promoter were heavily
methylated and only transcribed in the testis of transgenic mice. Methylation and
silencing in mice were also observed when the bacterial lac repressor sequence was
downstream of the F9-1 polyoma promoter. In an attempt to reactivate silenced
transgenes, the primary DNA sequence of the bacterial lacI gene was changed to
resemble a mammalian coding sequence more closely and still code for the same
amino acid sequence. Although this synlacI gene was widely transcribed, it did not
produce a functional product. To achieve the successful transfer of a fully functional
lac operator-repressor system to the mouse, extensive modifications to the synlacI
gene were required, as described herein.

The present invention is directed to a system that uses elements from
the lac operon of E. coli for controlling phenotype in an animal. More particularly,
the present invention is directed to a regulatory system for controlling the expression
of recombinant genes in animals, including mammalian species. One important
component of this regulatory system is a lac repressor transgene that expresses
functional levels of repressor protein in the transgenic mouse. Although others have
attempted to utilize prokaryotic repressor proteins in eukaryotic cells to control the
expression of genes, none have succeeded in preparing a system that uses the lac
repressor to regulate expression in mammals in vivo.

For example, there is a reversible system based on the tet operon of E.
coli that is commercially available. However, when Gossen and Bujard adapted the
tet system for use in mammalian cells, they converted the tet repressor into a tet
transactivator. This conversion resulted from the fusion of the tet repressor with the
activating domain of the herpes simplex virus VP16 transcriptional activator, and thus
it is necessary to permanently couple the tet operator to a viral promoter (Gossen and
Bujard, PNAS 89(12), 5547-51, 1992) for the system to work. Binding of repressor to
the operator serves only to align the VP16 fusion partner with its specific binding site
in the viral promoter, and it is the binding of VP16 to the viral promoter that activates
transcription. This dependence of the tet system on viral promoter elements limits its
applicability in the mouse, where non-mammalian promoters very frequently lead to
erratic expression of downstream coding sequences. Low level leakiness and
heterogeneous expression (Redfern et al. PNAS, 97(9), 4826-31, 1999) have been
problems with use of the minimal CMV promoter, and the VP16 activating domain
has been found to be toxic to cells. Furthermore, while US Patent No. 5,589,392
discloses the use of the lac repressor in an inducible mammalian expression system,
this system also uses viral promoters and fails to produce adequate levels of lac
repressor in mice.

The present invention describes nucleic acid constructs and methods
that eliminate the necessity of using viral promoters or viral DNA binding proteins in
a prokaryotic-based regulatory system. This lends the system a particularly strong
element of predictability that other prokaryotic-based systems cannot match. Another
significant advantage is that, in addition to being able to regulate a mammalian
promoter as part of a transgene, the lac system holds the promise of endogenous gene
regulation. By inserting lac operators into an endogenous promoter (or elsewhere in a
gene) by homologous recombination, it should be possible to gain control over
resident genes to create mouse models of disease and to elucidate gene function in
their natural context.



Summary of the Invention

The present invention is directed to a novel repressor protein gene
construct, derived from the E. coli lac repressor gene. The present invention also
encompasses an inducible bacterial expression system and the use of that system for
regulating the expression of genes in vivo in an animal. The system comprises two
main elements, the first being a novel gene construct for expressing the lac repressor
protein, and the second being a gene that is operably linked to an operator region. In
this system, the repressor protein binds to the operator in the absence of the inducer
agent to prevent transcription of the gene. Subsequent addition of the inducer causes
the release of the repressor, allowing expression of the gene.

Brief Description of the Drawings 

Fig. 1. Schematic representation of the lacI transgene. The gene construct
comprises a promoter (stippled box) operably linked to a rabbit β-globin
intron2/exon3 sequence (open box) which is operably linked to a modified lac
repressor coding region which is operably linked to rabbit β-globin 3' untranslated
sequences (solid bar). The modified lac repressor coding region is made up of
segments that are identical to the wild type bacterial sequence (crosshatched box), and
segments that have been reencoded to use mammalian codons (striped box).


Fig. 2. Regulatable TyrlacO transgene. Three lac operators have been
introduced into the murine tyrosinase promoter. The primary operator was centered
just downstream of the start of transcription by changing the endogenous promoter
sequence; two additional operators were inserted 176 bp and 526 bp upstream. The
modified promoter drives expression of the wild type murine tyrosinase cDNA.

Fig. 3. Diagram of the lacO-promoter trap vector. The lac OCR elements are
indicated with two vertical stippled rectangles (the operator sequences) separated by a
black rectangle (the 150 bp stuffer). Each OCR is separated by a 400 bp fragment
from the rabbit β-globin IVS2 (striped rectangle). loxP sites are indicated with ovals.
The IRES-GFPneo cassette (crosshatched rectangle) with associated 3' splice site (3'
spl) and poly(A) addition site (pA) are indicated.

Detailed Description of the Invention
Definitions 

In describing and claiming the invention, the following terminology
will be used in accordance with the definitions set forth below.

As used herein, "nucleic acid," "DNA," and similar terms also include
nucleic acid analogs, i.e. analogs having other than a phosphodiester backbone. For
example, the so-called "peptide nucleic acids," which are known in the art and have
peptide bonds instead of phosphodiester bonds in the backbone, are considered within
the scope of the present invention.
The term "peptide" encompasses a sequence of 3 or more amino acids
wherein the amino acids are naturally occurring or synthetic (non-naturally occurring)
amino acids. Peptide mimetics include peptides having one or more of the following
modifications:

1. peptides wherein one or more of the peptidyl -C(O)NR- linkages (bonds)
have been replaced by a non-peptidyl linkage such as a -CH2-carbamate linkage
(-CH2OC(O)NR-), a phosphonate linkage, a -CH2-sulfonamide (-CH2-S(O)2NR-)
linkage, a urea (-NHC(O)NH-) linkage, a -CH2-secondary amine linkage, or with
an alkylated peptidyl linkage (-C(O)NR-) wherein R is C1-C4 alkyl;

2. peptides wherein the N-terminus is derivatized to a -NRR1 group, to a
-NRC(O)R group, to a -NRC(O)OR group, to a -NRS(O)2R group, to a
-NHC(O)NHR group where R and R1 are hydrogen or C1-C4 alkyl with the proviso
that R and R1 are not both hydrogen;

3. peptides wherein the C terminus is derivatized to -C(O)R2 where R2 is
selected from the group consisting of C1-C4 alkoxy, and -NR3R4 where R3 and R4
are independently selected from the group consisting of hydrogen and C1-C4 alkyl.

Naturally occurring amino acid residues in peptides are abbreviated as
recommended by the IUPAC-IUB Biochemical Nomenclature Commission as
follows: Phenylalanine is Phe or F; Leucine is Leu or L; Isoleucine is Ile or I;
Methionine is Met or M; Norleucine is Nle; Valine is Val or V; Serine is Ser or S;
Proline is Pro or P; Threonine is Thr or T; Alanine is Ala or A; Tyrosine is Tyr or Y;
Histidine is His or H; Glutamine is Gln or Q; Asparagine is Asn or N; Lysine is Lys
or K; Aspartic Acid is Asp or D; Glutamic Acid is Glu or E; Cysteine is Cys or C;
Tryptophan is Trp or W; Arginine is Arg or R; Glycine is Gly or G; and X is any
amino acid. Other naturally occurring amino acids include, by way of example,
4-hydroxyproline, 5-hydroxylysine, and the like.
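As an illustration only, the abbreviation table above can be written as a lookup that converts three-letter residue codes to one-letter codes. The function name `to_one_letter` is hypothetical, and the mapping uses the standard spellings Ile and Gln. Norleucine (Nle) is left out because the text gives it no one-letter code. The example input is the seven-residue SV40 sequence quoted later in this description.

```python
# One-letter codes for the three-letter abbreviations listed in the text.
# Norleucine (Nle) is omitted: the text assigns it no one-letter code.
THREE_TO_ONE = {
    "Phe": "F", "Leu": "L", "Ile": "I", "Met": "M", "Val": "V",
    "Ser": "S", "Pro": "P", "Thr": "T", "Ala": "A", "Tyr": "Y",
    "His": "H", "Gln": "Q", "Asn": "N", "Lys": "K", "Asp": "D",
    "Glu": "E", "Cys": "C", "Trp": "W", "Arg": "R", "Gly": "G",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a run of three-letter codes (e.g. "ProLysLys") to one-letter codes."""
    if len(three_letter_sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codes = [
        three_letter_sequence[i:i + 3]
        for i in range(0, len(three_letter_sequence), 3)
    ]
    # An unknown code raises KeyError rather than being silently skipped.
    return "".join(THREE_TO_ONE[code] for code in codes)


print(to_one_letter("ProLysLysLysArgLysVal"))  # PKKKRKV
```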

Synthetic or non-naturally occurring amino acids refer to amino acids
which do not naturally occur in vivo but which, nevertheless, can be incorporated into
the peptide structures described herein. The resulting "synthetic peptide" contains
amino acids other than the 20 naturally occurring, genetically encoded amino acids at
one, two, or more positions of the peptides. For instance, naphthylalanine can be
substituted for tryptophan to facilitate synthesis. Other synthetic amino acids that can
be substituted into peptides include L-hydroxypropyl, L-3,4-dihydroxyphenylalanyl,
alpha-amino acids such as L-alpha-hydroxylysyl and D-alpha-methylalanyl,
L-alpha-methylalanyl, beta-amino acids, and isoquinolyl. D amino acids and non-naturally
occurring synthetic amino acids can also be incorporated into the peptides. Other
derivatives include replacement of the naturally occurring side chains of the 20
genetically encoded amino acids (or any L or D amino acid) with other side chains.

As used herein, the term "conservative amino acid substitution" is
defined as an exchange within one of the following five groups:

I. Small aliphatic, nonpolar or slightly polar residues:
Ala, Ser, Thr, Pro, Gly;

II. Polar, negatively charged residues and their amides:
Asp, Asn, Glu, Gln;

III. Polar, positively charged residues:
His, Arg, Lys;

IV. Large, aliphatic, nonpolar residues:
Met, Leu, Ile, Val, Cys;

V. Large, aromatic residues:
Phe, Tyr, Trp.
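The five groups above lend themselves to a simple membership test. The following is a minimal sketch, assuming residues are given as standard three-letter codes; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical and not part of the disclosure.

```python
# The five substitution groups from the definition above, keyed by group number.
CONSERVATIVE_GROUPS = {
    "I": {"Ala", "Ser", "Thr", "Pro", "Gly"},
    "II": {"Asp", "Asn", "Glu", "Gln"},
    "III": {"His", "Arg", "Lys"},
    "IV": {"Met", "Leu", "Ile", "Val", "Cys"},
    "V": {"Phe", "Tyr", "Trp"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True when two different residues belong to the same group."""
    if original == replacement:
        return False  # no exchange took place
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


print(is_conservative("Leu", "Ile"))  # True: both are in group IV
print(is_conservative("Asp", "Lys"))  # False: groups II and III
```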






A "polylinker" is a nucleic acid sequence that comprises a series of
three or more different restriction endonuclease recognition sequences closely spaced
to one another (i.e. less than 10 nucleotides between each site).

As used herein, the term "vector" is used in reference to nucleic acid
molecules that have the capability of replicating autonomously in a host cell, and
optionally may be capable of transferring DNA segment(s) from one cell to another.
Vectors can be used to introduce foreign DNA into host cells where it can be
replicated (i.e., reproduced) in large quantities. Examples of vectors include plasmids,
cosmids, lambda phage vectors, and viral vectors (such as retroviral vectors).

A plasmid, as used herein, is a circular piece of DNA that has the
capability of replicating autonomously in a host cell. A plasmid typically also
includes one or more marker genes that are suitable for use in the identification and
selection of cells transformed with the plasmid.

As used herein a "gene" refers to the nucleic acid coding sequence as 
well as the regulatory elements necessary for the DNA sequence to be transcribed into 
messenger RNA (mRNA) and then translated into a sequence of amino acids 
characteristic of a specific polypeptide. 

A "marker" is an atom or molecule that permits the specific detection
of a molecule comprising that marker in the presence of similar molecules without
such a marker. Markers include, for example, radioactive isotopes, antigenic
determinants, nucleic acids available for hybridization, chromophores, fluorophores,
chemiluminescent molecules, electrochemically detectable molecules, molecules that
provide for altered fluorescence-polarization or altered light-scattering, and molecules
that allow for enhanced survival of a cell or organism (i.e. a selectable marker). A
reporter gene is a gene that encodes a marker.

A promoter is a DNA sequence that directs the transcription of a DNA
sequence, such as the nucleic acid coding sequence of a gene. Promoters can be
inducible (the rate of transcription changes in response to a specific agent), tissue
specific (expressed only in some tissues), temporal specific (expressed only at certain
times) or constitutive (expressed in all tissues and at a constant rate of transcription).
As used herein a eukaryotic promoter is a promoter that is isolated from an organism
whose DNA is localized to a nucleus bounded by a membrane. A eukaryotic
promoter is not a viral promoter.

A core promoter contains essential nucleotide sequences for promoter
function, including the TATA box and start of transcription. By this definition, a core
promoter may or may not have detectable activity in the absence of specific sequences
that enhance the activity or confer tissue specific activity.

An "enhancer" is a DNA regulatory element that can increase the
efficiency of transcription, regardless of the distance or orientation of the enhancer
relative to the start site of transcription.

As used herein, the terms "complementary" or "complementarity" are
used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the
base-pairing rules. For example, the sequence "A-G-T" is complementary to the
sequence "T-C-A."
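The base-pairing rule in this definition can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, and the hyphen-separated notation simply mirrors the "A-G-T" example in the text. The second function gives the same complementary strand read in the opposite direction, which is how it is usually written 5' to 3'.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(sequence: str) -> str:
    """Pair each base of a hyphen-separated sequence, position by position."""
    return "-".join(COMPLEMENT[base] for base in sequence.split("-"))


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand read in the opposite direction."""
    return "-".join(COMPLEMENT[base] for base in reversed(sequence.split("-")))


print(complement("A-G-T"))          # T-C-A
print(reverse_complement("A-G-T"))  # A-C-T
```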

As used herein, the term "hybridization" is used in reference to the
pairing of complementary nucleic acids. Hybridization and the strength of
hybridization (i.e., the strength of the association between the nucleic acids) are
affected by such factors as the degree of complementarity between the nucleic acids,
stringency of the conditions involved, the length of the formed hybrid, and the G:C
ratio within the nucleic acids.

As used herein, the term "purified" and like terms relate to the isolation 
of a molecule or compound in a form that is substantially free of contaminants 
normally associated with the molecule or compound in a native or natural 
environment. 

A "linker" is a molecule (or group of molecules) that serves to
chemically link two disparate entities. For example, a peptide linker chemically links
two polypeptides via a peptide bond.

As used herein, the term "repressor" and like terms refer to the
polypeptide encoded by a nucleic acid sequence comprising the sequence of SEQ ID
NO: 1. In the absence of an inducer, the repressor binds to a nucleic acid operator
present in a gene and inhibits transcription of the operably linked gene. Upon binding
of the repressor to a specific inducer, the repressor dissociates from the operator to
which it was bound, thereby permitting transcription of the gene to occur.

As used herein, the term "nuclear localization signal" and like terms
refer to an amino acid residue sequence that, when present in a protein, directs
migration of that protein to the cell's nucleus, as evidenced by accumulation of the
protein in the nucleus after biosynthesis in the cell's cytoplasm.

An operator is a nucleic acid sequence that represents the binding site
for a repressor. The repressor and operator form a system for regulating a gene that is
operably linked to the operator, wherein binding of the repressor to the operator
inhibits transcription of the linked gene.

An inducer is a molecule, typically a low molecular weight molecule,
that binds to the repressor of the present invention and causes the repressor to
dissociate from an operator to which the repressor is bound.

"Operably linked" refers to a juxtaposition wherein the components are 
configured so as to perform their usual function. Thus, promoters operably linked to a 
coding sequence are capable of effecting the expression of the coding sequence; and
an operator that is operably linked to a promoter (or other gene element) is capable of 
inhibiting transcription from the linked promoter. 

As used herein the term "gene element" is intended to encompass any
portion of a gene where one or more operator elements can be inserted, wherein the
operator in conjunction with its corresponding repressor will reversibly inhibit
expression of the linked gene. For example, the operator element(s) can be inserted
into an intron of a gene.

As used herein, the term "pharmaceutically acceptable carrier"
encompasses any of the standard pharmaceutical carriers, such as a phosphate
buffered saline solution, water and emulsions such as an oil/water or water/oil
emulsion, and various types of wetting agents.

The Invention 

Technological advances have made it possible to alter the genomes of
animals, such as the mouse, and add or subtract genetic material relatively easily. As
a result, the mouse is widely used to model mammalian development and disease.
One recognized drawback has been the inability to control the expression of the
altered genome experimentally once changes have been made. A reversible gene
expression system could overcome this problem by enabling a target gene to be
switched on and off without affecting the expression of non-targeted genes.

Reversible systems adapted from mammalian regulatory elements,
such as heavy metal- or hormone-responsive promoters, have the disadvantage that
whatever induces the targeted gene will also induce those endogenous genes that
normally respond to it. In contrast, regulatory systems based on prokaryotic elements
should in principle be exquisitely specific in the context of the mammalian genome.

The present invention is directed to a system for controlling phenotype
in transgenic plants and animals based on elements from the lac operon of E. coli.
The system of the present invention comprises two elements, the first of which is a
eukaryotic gene element that has been modified to contain lac operon elements, and
the second of which comprises the gene encoding the lac repressor. More
particularly, the lac repressor transgene of the present invention is one that has been
modified to express functional levels of the repressor protein of SEQ ID NO: 3 (or a
sequence that differs from SEQ ID NO: 3 by 1-15, more preferably 1-3 conservative
amino acid substitutions) in a transgenic plant or animal.

In accordance with one embodiment, a lacI recombinant gene is
provided that can be expressed in a mammalian cell at levels sufficient to regulate the
expression of a second recombinant gene that has been modified to contain at least
one copy of the lac operator. Preferably, the lacI coding sequence has the sequence of
SEQ ID NO: 1, wherein the DNA sequence of the native lac repressor is altered to
enable expression in a transgenic plant or animal. More particularly, the native
bacterial sequence is modified in part to resemble the mammalian preferred codon
usage while maintaining the same encoded amino acid sequence.

It is anticipated that minor alterations to the DNA sequence of SEQ ID
NO: 1 can be made that will retain the gene's ability to express a functional repressor
in a transgenic plant or animal. Accordingly, a nucleic acid gene construct comprising
the sequence of SEQ ID NO: 1 or sequences that differ from SEQ ID NO: 1 by 1 to
100, or 1 to 50, more preferably 1 to 25 nucleotide alterations that still encode a
functional repressor protein are within the scope of the present invention. These
nucleotide alterations may include nucleotide deletions, insertions or substitutions of
one nucleotide for another. Typically the nucleotide alteration is a simple transition
(one purine replaced by the other purine, or one pyrimidine by the other pyrimidine).
In one embodiment a nucleic acid
sequence is provided comprising the sequence of SEQ ID NO: 1 or sequences that
differ from SEQ ID NO: 1 by 1 to 20, more preferably 1 to 5 nucleotide alterations,
that do not alter the amino acid sequence of the encoded repressor protein. The
present invention also encompasses nucleic acid sequences that hybridize (under
conditions defined herein) to all or a portion of the nucleotide sequence represented by
SEQ ID NO: 1 or its complement and encode a repressor protein that is functional in a
transgenic plant or animal.
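As a rough illustration of the "1 to 20 nucleotide alterations that do not alter the amino acid sequence" criterion, the sketch below counts substitutions between two equal-length coding sequences and compares their translations under the standard genetic code. It is a simplified example, not a method taken from this disclosure: it handles substitutions only (no insertions or deletions), the function names are hypothetical, and the short sequences are invented placeholders rather than any part of SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_substitutions(reference: str, variant: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("this sketch only compares sequences of equal length")
    return sum(a != b for a, b in zip(reference.upper(), variant.upper()))


def is_silent_variant(reference: str, variant: str, max_changes: int = 20) -> bool:
    """True when the variant has 1..max_changes substitutions and the same protein."""
    changes = count_substitutions(reference, variant)
    return 1 <= changes <= max_changes and translate(reference) == translate(variant)


# Invented placeholder sequences, both encoding Met-Lys-Pro.
reference = "ATGAAACCA"
variant = "ATGAAGCCG"
print(count_substitutions(reference, variant))  # 2
print(is_silent_variant(reference, variant))    # True
```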
Nucleic acid duplex or hybrid stability is expressed as the melting
temperature or Tm, which is the temperature at which a nucleic acid duplex
dissociates into its component single stranded DNAs. This melting temperature is
used to define the required stringency conditions. Typically a 1% mismatch results in
a 1°C decrease in the Tm, and the temperature of the final wash in the hybridization
reaction is reduced accordingly (for example, if two sequences have > 95% identity,
the final wash temperature is decreased from the Tm by 5°C). In practice, the change
in Tm can be between 0.5°C and 1.5°C per 1% mismatch.
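The rule of thumb above amounts to simple arithmetic, sketched here under stated assumptions: the function name and parameters are hypothetical, the Tm of 80°C is an invented input used only to show the calculation, and the coefficient is restricted to the 0.5 to 1.5°C range given in the text.

```python
def final_wash_temperature(
    tm_celsius: float,
    percent_identity: float,
    degrees_per_percent_mismatch: float = 1.0,
) -> float:
    """Lower the final wash temperature from the Tm in proportion to the mismatch."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    if not 0.5 <= degrees_per_percent_mismatch <= 1.5:
        raise ValueError("the text gives a range of 0.5 to 1.5 C per 1% mismatch")
    percent_mismatch = 100.0 - percent_identity
    return tm_celsius - percent_mismatch * degrees_per_percent_mismatch


# 95% identity means 5% mismatch, so the wash is 5 C below the (assumed) Tm.
print(final_wash_temperature(80.0, 95.0))       # 75.0
print(final_wash_temperature(80.0, 95.0, 1.5))  # 72.5
```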

The present invention is directed to the nucleic acid sequence of SEQ
ID NO: 1 and nucleic acid sequences that hybridize to that sequence (or fragments
thereof) under stringent or highly stringent conditions. In one embodiment the
invention is directed to a purified nucleic acid sequence that encodes a functional
repressor polypeptide (i.e. one capable of specific and reversible binding to its
corresponding operator) that hybridizes to SEQ ID NO: 1 or its complement under
highly stringent or stringent conditions. In accordance with the present invention
highly stringent conditions are defined as conducting the hybridization and wash
at no lower than 5°C below the Tm. Stringent conditions involve
hybridizing at 68°C in 5x SSC/5x Denhardt's solution/1.0% SDS, and washing in 0.2x
SSC/0.1% SDS at 68°C. Moderately stringent conditions include hybridizing at
68°C in 5x SSC/5x Denhardt's solution/1.0% SDS and washing in 3x SSC/0.1% SDS
at 42°C. Additional guidance regarding such conditions is readily available in the art,
for example, in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual,
Cold Spring Harbor Press, N.Y.; and Ausubel et al. (eds.), 1995, Current Protocols in
Molecular Biology, (John Wiley & Sons, N.Y.) at Unit 2.10.

As reported in the literature, simply modifying the bacterial sequence
to resemble the preferred mammalian codon usage fails to produce a functional
product in eukaryotic cells. Such a modified repressor gene (the "synlacI" gene
construct, described in Scrable and Stambrook, Genetics 147, 297-304 (1997)),
produced high levels of mRNA, but no functional protein. Northern blot analysis of
total RNA hybridized to a lac repressor probe identified a single transcript in RNA
from animals transgenic for the bacterial repressor construct, but a doublet was
detected in RNA from animals transgenic for the synlacI gene construct.
Through the use of a series of chimeric repressor gene constructs made
by exchanging the 5' region of the wild type bacterial repressor with the corresponding
region of the synlacI gene construct, it was determined that the mRNA transcribed
from the synlacI sequence was being improperly processed. More particularly, the
chimeric constructs revealed that the synlacI mRNA is improperly spliced as a result
of a sequence present in the first 36 bp of the synlacI coding region. Within these first
36 bp of coding sequence, the wild type bacterial repressor and the synlacI gene
constructs differ by only four bases, and the changes are in all cases simple transitions
(purine to purine, or pyrimidine to pyrimidine). In particular, the transitions made in the
original synlacI coding region comprise A to G at position 15, G to A at position 18, T
to C at position 21 and T to C at position 27.
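The four substitutions listed above (A to G, G to A, T to C and T to C) each stay within one base class, which is what the term transition denotes; a change between a purine and a pyrimidine would instead be a transversion. The following sketch classifies each change. The names `classify_substitution` and `SYNLACI_CHANGES` are hypothetical; the positions and bases are those given in the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old_base: str, new_base: str) -> str:
    """Label a single-base change as a transition or a transversion."""
    old_base, new_base = old_base.upper(), new_base.upper()
    valid = PURINES | PYRIMIDINES
    if old_base not in valid or new_base not in valid:
        raise ValueError("bases must be A, C, G or T")
    if old_base == new_base:
        raise ValueError("no substitution: the bases are identical")
    same_class = (old_base in PURINES) == (new_base in PURINES)
    return "transition" if same_class else "transversion"


# Position -> (bacterial base, synlacI base), as listed in the text.
SYNLACI_CHANGES = {15: ("A", "G"), 18: ("G", "A"), 21: ("T", "C"), 27: ("T", "C")}

for position, (old_base, new_base) in SYNLACI_CHANGES.items():
    print(position, old_base, new_base, classify_substitution(old_base, new_base))
```

Each of the four lines printed ends in "transition".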

The synlacI repressor gene construct was modified to incorporate the
nucleotide changes at positions 15, 18, 21 and 27 (thus reverting the sequence back
to the wild type bacterial sequence). However, further investigation revealed that in
addition to splicing, there was a second problem with the synlacI coding region that
affected the expression of functional repressor. In particular, using a series of
chimeric repressor gene constructs made by exchanging the 3' region of the wild type
bacterial repressor with the corresponding region of the synlacI gene construct, it was
determined that functional lac repressor activity was being blocked by the region of
the synlacI sequence in the dimerization domain. Accordingly, the coding sequences
between the EcoRV and the PvuII restriction sites of the synlacI construct were
replaced with wild type bacterial lacI coding sequences. This repressor encoding
sequence, represented as SEQ ID NO: 1, is correctly spliced and translated in
transfected eukaryotic cells.

A gene construct comprising the nucleic acid sequence of SEQ ID NO:
1 operably linked to the β-actin promoter (construct 3'C4) was used to create
transgenic mice, but surprisingly this construct expressed the repressor only in the
testis, resembling the expression pattern of a transgene composed entirely of bacterial
coding sequence. The content of CpG dinucleotides in the coding sequence is a major
determinant of transcription in animals. In the nucleic acid construct of SEQ ID
NO: 1, the replacement of the synlacI sequence between the EcoRV and PvuII sites
with the corresponding nucleic acid sequence from the bacterial lacI changed the 3'
terminal region from one devoid of CpG (and ubiquitously expressed) to one that is
CpG rich (and expressed only in the testis). However, altering the sequence to
remove the CpG rich region leads to a gene that is transcribed but not translated, as
noted above.

In an attempt to understand this problem, CpG-density maps were
prepared for each repressor construct and aligned with the CpG-density map of the
β-actin gene. This analysis revealed two segments downstream from the actin promoter
that are free of CpG dinucleotides. In repressor constructs that contained CpG
dinucleotides in a corresponding region, the expression of the repressor was limited to
the testis. Thus, to overcome this problem, the repressor coding region was flanked
with non-coding regions to move the repressor coding region farther away from the
promoter used to express the repressor coding region. Structural repositioning of the
3' CpGs (construct R) resulted in ubiquitous expression of the repressor product in
transgenic animals.

Therefore, the expression of lacI from the β-actin promoter in
transgenic animals depends on the density and position of CpG-rich regions in the lacI
transgene. In summary, the overall gene construct should be prepared so that at least
two small regions (of about 100 bp in length) that lie approximately 600 and 800 bp
downstream of the transcription start site are devoid of CpG dinucleotides. In
accordance with one embodiment a repressor gene construct is prepared comprising a
eukaryotic promoter operably linked to a eukaryotic intron that is in turn operably
linked to the lacI coding sequence, wherein the lacI coding sequence is operably
linked to the 3' untranslated region of a eukaryotic gene. The inclusion of the
eukaryotic intron sequences is also believed to be important in optimizing the
transport of the mRNA from the nucleus to the cytoplasm and subsequent translation
of the repressor protein. The intron sequence used is not critical provided that it has
the necessary splice junctions to be properly excised from the encoded mRNA.
Furthermore, in one embodiment (when using the β-actin promoter, for example) the
intron provides adequate spacing so that two 100 bp regions devoid of CpG
dinucleotides are located approximately 600 and 800 bp downstream of the
transcription start site. In one preferred embodiment the eukaryotic promoter used to
drive the expression of the repressor is a mammalian promoter, and the intron and the
3' untranslated region of the modified repressor gene are selected from β-globin, and
more particularly the gene construct comprises the sequence of SEQ ID NO: 4.
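
By way of illustration only, the CpG-spacing guideline described above can be checked computationally before a construct is assembled. The following is a minimal Python sketch and not part of the described constructs. The function names (cpg_count, window_is_cpg_free, check_cpg_free_windows) are hypothetical, and the sketch assumes that the input sequence begins at the transcription start site and that each 100 bp window starts exactly at the stated offset, whereas the text gives those positions only approximately.

```python
def cpg_count(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G) in seq."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def window_is_cpg_free(transcript: str, offset: int, length: int = 100) -> bool:
    """Return True if the window transcript[offset:offset+length] has no CpG.

    Assumption: transcript[0] is the transcription start site, so `offset`
    is the distance downstream of the start site in base pairs.
    """
    window = transcript[offset:offset + length]
    if len(window) < length:
        raise ValueError("window extends past the end of the sequence")
    return cpg_count(window) == 0


def check_cpg_free_windows(transcript: str, offsets=(600, 800), length: int = 100) -> dict:
    """Report, for each offset, whether the window starting there is CpG-free."""
    return {off: window_is_cpg_free(transcript, off, length) for off in offsets}


if __name__ == "__main__":
    # Toy sequence (not a real construct): CpG-rich filler with two
    # CpG-free 100 bp stretches placed at +600 and +800.
    toy = "ACGT" * 150 + "AT" * 50 + "CG" * 50 + "TA" * 50 + "ACGT" * 25
    print(check_cpg_free_windows(toy))
```

In practice the offsets could be scanned over a small range around 600 and 800 bp to reflect the approximate positions given in the text.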

One goal of the present invention is to use the repressor gene of the
present invention to regulate the expression of other genes in vivo. Therefore, to
obtain such regulation of eukaryotic genes it is necessary to have the expressed
repressor transported into the cell's nucleus. In accordance with one embodiment the
repressor encoding nucleic acid sequence of the present invention is operably linked
to a nuclear localization signal sequence (NLS). Nucleus-targeting sequences have
been described for a variety of proteins and typically are short amino acid residue
sequences of about 5-15 residues. Any of the NLS sequences that have been
previously described in the literature are suitable for use in accordance with the
present invention and include those described by Jans et al., BioEssays, 22:532-544
(2000), the disclosure of which is incorporated herein.

In accordance with one embodiment the SV40 nuclear localization
signal (NLS) is used to direct the recombinant repressor protein to the nucleus. The
SV40 Large T antigen has been reported to contain a seven amino acid residue
sequence (ProLysLysLysArgLysVal; SEQ ID NO: 7) that defines a minimum region of
the Large T antigen required for nuclear targeting (see Kalderon et al., Cell, 39:499-509
(1984)). The SV40-derived nuclear localization signal has been engineered into
several different proteins to cause them to accumulate in the nucleus of a cell.
For example, it has been used to target bacteriophage T7 RNA polymerase to
mammalian cell nuclei (Dunn et al., Gene, 68:259-266, 1988) and to yeast cell nuclei
(Benton et al., Mol. Cell. Biol., 10:353-360, 1990). In accordance with one embodiment
the nuclear localization signal of SV40 is linked to the repressor encoding sequence
of SEQ ID NO: 1 to produce the nucleic acid sequence of SEQ ID NO: 2.

Although the SV40 nuclear localization sequence is used in one
embodiment of the present invention, other nuclear localization sequences can be utilized.
For example, the NLS of the adenovirus E1a gene product (LysArgProArgPro; SEQ
ID NO: 8) that is located at the extreme carboxyl terminus of E1a (see Lyons et al.,
Mol. Cell. Biol., 7:2451-2456 (1987)) can be utilized. In addition, other NLS
sequences have been identified in both higher eukaryotes and in the yeast
Saccharomyces cerevisiae and are suitable for use in accordance with the present
invention. See, for example, the review by Silver et al., in "Protein Transfer and
Organelle Biogenesis", Das et al., eds., Academic Press, Inc., N.Y., pp. 747-769
(1988). Furthermore, assays for identifying proteins and protein regions having a
nucleus-targeting sequence have been described. See, for example, Parnaik et al., Mol.
Cell. Biol., 10:1287-1292 (1990).

The location of a nucleus-targeting sequence relative to the sequence
encoding the recombinant repressor of this invention can vary, so long as the resultant
protein exhibits the requisite properties. The NLS sequence is preferably located
either at the amino or the carboxy terminus of the encoded repressor protein. In
accordance with one embodiment the amino terminal location of a nucleus-targeting
sequence is within about 5 amino acid residues of the amino terminus of the inducible
lac repressor. Particularly preferred is a construct where the nucleus-targeting
sequence begins as the second amino acid residue after the amino-terminal methionine
encoded by the initiation codon (ATG). In the alternative embodiment wherein the
NLS is located at the carboxy terminus of the repressor, the NLS coding sequence is
located within 100 bases upstream, and more preferably 1-3 bases upstream, from the
termination codon of the DNA segment that codes for the inducible lac repressor. In
one embodiment the NLS sequence is linked to the repressor coding sequence through
the use of a short nucleotide linker.

Although fusions of the NLS to the repressor at either the extreme 5' or
3' end could be made that did not adversely affect repression or induction, an ideal
configuration utilizes a linker between lacI and the SV40 NLS. In one embodiment
the linker comprises a three amino acid linker (Ser-Ser-Leu, coded for by AGC-AGC-CTG)
between the end of lacI and the SV40 NLS. Accordingly, in one preferred
embodiment the NLS coding sequence is operably linked by the AGC-AGC-CTG
spacer oligonucleotide to the 5' terminal codon prior to a termination codon to
generate the sequence of SEQ ID NO: 2.

In accordance with one embodiment, a mammalian "gene" was
assembled from the modified lac repressor sequence with an NLS and the full-length
human beta-actin promoter fused to the intron of a genomic fragment of the rabbit
beta-globin gene. The lacI coding sequence was cloned in the remainder of the
beta-globin fragment, which included the 3' UTR and polyadenylation signal sequence.
The sequence consisting of rabbit β-globin intron 2, the lacI coding sequence and the
remainder of the beta-globin fragment including the 3' UTR is provided as SEQ ID
NO: 4. The sequence consisting of rabbit β-globin intron 2, the lacI coding sequence,
the remainder of the beta-globin fragment including the 3' UTR and the polyA signal
sequence from SV40 is provided as SEQ ID NO: 11. More particularly, in one
embodiment the modified lac repressor gene construct comprises:

1) The 4.3kb promoter region from the human β-actin promoter from the
EcoRI site up to the AscI site 70bp upstream of the start of translation (the reverse
complement of base pairs 3,483,536 to 3,479,221 of NT_007844, number refers to the
Genbank accession number).

2) Rabbit β-globin intron 2 from the blunted NcoI site through the EcoRI site in
exon 3 (bases 31528-32032 of m18818).

3) The lacI coding region was inserted at the EcoRI site:

The first 27bp are: atg aaa cca gta acg tta tac gat gtc (SEQ
ID NO: 9). It then continues as the synlacI sequence given in (Scrable and
Stambrook, 1997) until the EcoRV site at +800 of the coding sequence. There are
then 150bp identical to the wtlacI sequence, up to the PvuII site at +950 of the coding
sequence (bases 881-1030 of j01637). There are then 129bp identical to the synlacI
sequence (up to the NLS indicated in (Scrable and Stambrook, 1997)). A linker
region and SV40 large T-antigen NLS are attached to the 3' end of the coding region:
agc agc ctg agg cct ccc aag aag aag cga aag gtg tga (SEQ ID NO: 10)

4) The rabbit β-globin fragment continues downstream of the lacI coding
region to include the rest of exon 3 and the 3' UTR with a polyadenylation signal
sequence (from the EcoRI site to the PvuII site; the reverse complement of bases
32033-32571 of m18818). It is followed by the polyA signal sequence from SV40
(from the HpaI site to the BamHI site; bases 2669-2539 of j02400). Following the
SV40 sequence is a 276 bp fragment of the cloning vector pBR327 (SEQ ID NO: 12;
from the BamHI site at bp 375 to the SalI site at bp 651). All of the β-globin
sequences, the SV40 polyA signal sequence, and the vector fragment are as described
in (Katsuki et al., 1988). This gene construct can be introduced into eukaryotic cells,
and more particularly the construct can be used to prepare transgenic animals, such as
mice and other mammals, containing such a construct.
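
As an illustrative check of the short sequences recited in element 3) above, the following minimal Python sketch translates the first 27 bp of the coding region (SEQ ID NO: 9) and the linker/NLS segment (SEQ ID NO: 10) using the standard genetic code. The helper name translate is hypothetical, and the linker codons are taken as AGC-AGC-CTG, as stated elsewhere in the text. The sketch is offered only to show that the recited codons encode the Ser-Ser-Leu linker followed by a segment containing the SV40 NLS (ProLysLysLysArgLysVal); it is not a tool used in preparing the construct.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product("TCAG", repeat=3)), AMINO_ACIDS))


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids.

    Stop codons are rendered as '*'. The input length must be a multiple of 3.
    """
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


if __name__ == "__main__":
    first_27bp = "atg aaa cca gta acg tta tac gat gtc"                     # SEQ ID NO: 9
    linker_nls = "agc agc ctg agg cct ccc aag aag aag cga aag gtg tga"     # SEQ ID NO: 10
    print(translate(first_27bp))   # MKPVTLYDV
    print(translate(linker_nls))   # SSLRPPKKKRKV*
```

The second translation begins with Ser-Ser-Leu and ends with ProLysLysLysArgLysVal immediately before the stop codon, consistent with the linker and SV40 NLS described above.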

The eukaryotic promoter used to express the repressor gene sequence
can be selected from any of the known eukaryotic promoters, including promoters that
are constitutive, temporally regulated or tissue specific. The use of tissue- or cell
type-specific promoters in conjunction with the modified lac repressor of the present
invention will confer regional specificity on repressor expression and function. In one
embodiment the promoter is a mammalian promoter, and more particularly a
constitutive mammalian promoter.

In another embodiment of the present invention, nucleic acid sequences
encoding the modified lac repressor protein can be inserted into expression vectors
and used to transfect cells to express the repressor protein in the target cells or to
generate additional copies of the construct. In accordance with one embodiment, the
nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID NO: 4 is inserted
into an expression vector in a manner that operably links the gene sequences to the
appropriate regulatory sequences, and the recombinant repressor is expressed in a host
cell. Suitable host cells and vectors are known to those skilled in the art.

The expression vectors contemplated by the present invention are at
least capable of directing the replication, and preferably also expression, of a
structural gene operatively linked to the vector. In one embodiment, a vector
contemplated by the present invention includes a procaryotic replicon, i.e., a DNA
sequence having the ability to direct autonomous replication and maintenance of the
recombinant DNA molecule extrachromosomally in a procaryotic host cell, such as a
bacterial host cell. Such replicons are well known in the art and include OriC. In
addition, those embodiments that include a procaryotic replicon may also include a
gene whose expression confers a selective advantage such as amino acid nutrient
dependency or drug resistance to the transformed bacterial host cell that allows
selection of transformed clones. Typical bacterial drug resistance genes are those that
confer resistance to antibiotics such as ampicillin, tetracycline, kanamycin, and the
like.

Expression vectors compatible with eukaryotic cells, preferably those
compatible with cells of vertebrate or mammalian species, can also be used to form
the recombinant DNA molecules of the present invention. Eukaryotic cell expression
vectors are well known in the art and are available from several commercial sources.
Typically, such vectors are provided containing convenient restriction sites (i.e., a
polylinker) for insertion of the desired gene. Typical of such vectors are pSVL and
pKSV-10 (Pharmacia), pBPV-1/pML2d (International Biotechnologies, Inc.), and
pTDT1 (ATCC, #31255).

In preferred embodiments, the eukaryotic cell expression vectors used
to construct the recombinant DNA molecules of the present invention include a
selectable phenotypic marker that is effective in a eukaryotic cell, such as a drug
resistance selection marker or selective marker based on nutrient dependency. For
example, drug resistance markers suitable for use in the present invention include
the neomycin phosphotransferase (neo) gene (Southern et al., J. Mol. Appl. Genet.,
1:327-341, 1982) and the hygromycin resistance gene.

Nucleic acid sequences encoding the recombinant repressor protein
may be introduced into a cell or cells in vitro or in vivo using standard techniques,
including the use of liposomes, viral based vectors, electroporation or microinjection.
Accordingly, one aspect of the present invention is directed to transgenic cell lines
that comprise the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID
NO: 4. In one embodiment a transgenic cell is provided that comprises the nucleic
acid sequence of SEQ ID NO: 4.

The present invention also encompasses gene constructs that are
regulated by the recombinant repressor of the present invention. These gene
constructs comprise an operator operably linked to a gene. In one preferred
embodiment the operator is operably linked to a eukaryotic promoter, wherein the
promoter is operably linked to an open reading frame. An operator is "operably
linked" to a gene/promoter when transcription of the gene is inhibited in the presence
of the repressor and absence of the inducer, and the inhibition is reversed when the
inducer of the repressor is also present. In one preferred embodiment the eukaryotic
promoter is a mammalian promoter.

The lac operator is a short (~20bp) DNA sequence that can be
synthesized with flanking ends to allow it to be inserted into an available restriction
site, or used with the polymerase chain reaction to replace a segment of DNA to
convert a mammalian promoter into a regulatable version. This has been done for the
murine H-2Kb promoter, the human serine tRNA promoter, and the murine PGK
promoter, as previously reported.

The present invention also encompasses nucleic acid constructs
wherein an operator is operably linked to a eukaryotic promoter and the promoter is
operably linked to a polylinker. This nucleic acid construct can be utilized to
conveniently insert the coding region of a gene so that the transcription of the gene can
be regulated by a repressor protein interacting with the operator. In one preferred
embodiment the eukaryotic promoter is a mammalian promoter.

Operators function to control the expression of a gene by a variety of
mechanisms. The operator can be positioned within a promoter such that the binding
of the repressor covers the promoter's binding site for RNA polymerase, thereby
precluding access of the RNA polymerase to the promoter binding site. Alternatively,
the operator can be positioned downstream from the promoter binding site, thereby
blocking the movement of RNA polymerase down through the transcriptional unit. In
particular, it has been demonstrated, first in E. coli and later in rabbit kidney cells
(Deuschle et al., Science, 248(4954), 480-483, (1990)) expressing lac repressor, that a
single operator inserted into the middle of a transcription unit could interrupt
polymerase and cause premature termination of nascent RNA molecules. Thus it is
anticipated that operators inserted into introns (or other gene elements) of mammalian
genes will function as transcription terminators, which would alter the mechanism but
not the outcome of gene regulation by lac in mammalian cells and animals.

Multiple operators can be inserted into a gene construct to bind more
than one repressor. The advantages of multiple operators are several. First, tighter
blockage of RNA polymerase binding or translocation down the gene can be effected.
Second, when spaced apart by at least about 70 nucleotides and typically no more than
about 1000 nucleotides, and preferably spaced by about 200 to 500 nucleotides, a loop
can be formed in the nucleic acid by the interaction between a repressor protein bound
to the two operator sites. The loop structure formed provides strong inhibition of
RNA polymerase interaction with the promoter, if the promoter is present in the loop,
and provides inhibition of translocation of RNA polymerase down the transcriptional
unit if the loop is located downstream from the promoter.
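
For illustration, the spacing ranges recited above for loop formation between two operators can be expressed as a simple check. The following Python sketch is a hypothetical helper (the name classify_operator_spacing and the category labels are not taken from the text) that treats the recited "about" values as exact cut-offs, which is a simplifying assumption.

```python
def classify_operator_spacing(pos_a: int, pos_b: int) -> str:
    """Classify the distance between two operator positions (in nucleotides).

    Cut-offs follow the ranges recited in the text, treated here as exact:
      - at least about 70 and no more than about 1000 nt: looping range
      - about 200 to 500 nt: preferred spacing
    """
    spacing = abs(pos_b - pos_a)
    if spacing < 70 or spacing > 1000:
        return "outside looping range"
    if 200 <= spacing <= 500:
        return "preferred"
    return "permitted"


if __name__ == "__main__":
    for a, b in [(0, 50), (0, 93), (0, 300), (0, 1200)]:
        print(abs(b - a), classify_operator_spacing(a, b))
```

A design tool built on this idea could flag candidate operator insertion sites whose pairwise spacing falls outside the recited range.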

There are two considerations that should be taken into account when
modifying a eukaryotic promoter for use in a transgenic animal: where to position the
operators in the promoter, and selecting the operator sequence to use. Operator
position is important not only to achieve optimum repression, but also to minimize the
effect of promoter modification on basal promoter activity. Operator sequence is
important to ensure that induction of promoter activity is as successful as repression.
Typically the promoter will be modified to include an operator a few base pairs
upstream of a transcription start site and a second operator identical in sequence to the
first operator approximately 93 base pairs downstream from the first operator. In one
embodiment the operators have the sequence of SEQ ID NO: 6 and the first operator
is located approximately 1-3 base pairs upstream of the transcription start site.

In E. coli, the transcription start site (tss) of the regulatable lac
promoter is flanked by the primary operator just downstream of the start site (O1) and
a secondary operator, O2, located 93bp upstream. Selective pressure over eons of time
appears to have positioned these two operators in a nearly perfect physical relationship
to each other, as an optimum distance for repression has been found experimentally to
be 92.5bp. (Interestingly, maximal repression was also obtained experimentally at an
operator spacing of 70.5bp and at 115.5bp, the natural operator spacing in the gal
operon.) Experiments in E. coli have also demonstrated that repression by O1 at its
natural position increases up to 50-fold in the presence of an optimally positioned
auxiliary operator, which can be attributed to stable DNA loop formation. A third
operator (O3) lies within the coding sequence of the beta-galactosidase gene 401 bp
downstream of O1.

In preparing the tyrosinase promoter for regulation (see Example 1),
much of the original architecture of the bacterial promoter was replicated. Operators
were inserted 9bp downstream (O1) and 176bp upstream (O2) of the tss so that they
flanked the tss in the promoter. The addition of the beta-galactosidase gene
downstream brought the auxiliary operator in the lacZ coding sequence (O3) into a
position relative to the operators in the promoter that mimicked the spacing of that in
the lac operon. This configuration was tested in vitro and found to give tight
regulation when co-transfected with lacI DNA and grown in the presence or absence
of IPTG.

The auxiliary operator in lacZ was eliminated by replacing the
beta-galactosidase gene with the gene encoding the A-chain of diphtheria toxin, and a new
third operator was inserted 500 bp upstream of O2. Stringency of regulation was
assayed by counting the number of dead cells that resulted from induction of the toxin
by IPTG. Moving the third operator from a position downstream to a position
upstream did not appear to attenuate repression, as there was the same low number of
dead cells in untransfected cells as in cells co-transfected with the toxin gene and the
lac repressor. Thus, the third operator may function in the context of a regulatable
mammalian promoter in much the same way it does in the bacterial operon, where it
serves to sequester excess repressor molecules in close proximity to sites where they
are actively being used.

In an attempt to avoid having to flank the tss with operators, primary
and secondary operators were both placed downstream of tss to generate a regulatable
version of the murine p53 promoter. In conjunction with the operator in lacZ, this
resulted in about 90% repression in co-transfection assays with lacI and IPTG.
Attempts to simplify the modifications even further by using only one upstream 
operator (O, or 0 2 ) with 0 2 , or with tandem repeats of two or three primary operators 
with O z , were less successful. These manipulations with the p53 promoter show that 
it is possible to achieve repression without having to flank the tss with operators if one
is willing to sacrifice some degree of control. This may be a viable compromise in
situations when total repression of gene expression is not required.

Another alternative to flanking the tss with operators would be to place operators in
positions upstream of tss only. One of the most tightly lacO-regulated promoters
known is the modified SV40 immediate early promoter constructed by Figge et al. A
single operator was inserted between tss and the TATA box, creating a new tss the
same distance from TATA as the original tss had been. This single operator confers
virtually complete repression on the promoter in the presence of lacI. When lacZ was
replaced by CAT (which simultaneously eliminated O3), the same level of repression
was obtained. This strategy was also used to modify the H-2Kb promoter from the
MHC locus of the mouse. A single operator positioned between tss and the TATA
box conferred regulation on 80% of mouse L-cell clones stably transfected with lacI
and an H-2Kb lacZ reporter gene. At least two of the clones exhibiting tightly
regulated β-galactosidase expression contained only a single copy of lacI. Finally, as
noted above, a single operator inserted into the middle of a transcription unit could
interrupt polymerase and cause premature termination of nascent RNA molecules.

Any of the operator sequences known to those skilled in the art are
suitable for use in the present invention. There are two sequence variants of the lac
operator that have been used in experimental systems. The first, referred to as the
wild-type sequence, is the sequence found at the primary operator site (O1) in the
regulatable promoter of the bacterial operon. The sequence is an imperfect
palindrome whose mirror image reflects about a central unpaired guanine. The second
operator sequence is an "ideal" version of the first in which mismatched bases have
been replaced to create a perfect palindrome, and the central unmatched base has been
removed. No obvious significant difference in the efficacy of these two operator
sequences has been detected. The wild-type sequence, with its mismatches, is
less likely to self-anneal and for that reason may be easier to handle in the lab.

In accordance with one embodiment two optimized operators derived
from the lac operon and having the sequence ATTGTGAGCGCTCACAAT (SEQ ID
NO: 6) or TGTGGAATTGTGAGCGCTCACAATTCCACA (SEQ ID NO: 5) are used
in accordance with the present invention. A comparison of these two operators has
been conducted in mammalian cells. Each of these two operators was inserted into
the Pol III promoter of a human serine amber suppressor tRNA (Su+ tRNA) gene at
the -1 position. Suppressor activity in mammalian cells was measured as a function of
the ability of Su+ tRNA to suppress the UAG nonsense codon in a CAT reporter gene
co-transfected with lacI. With the 18bp operator it was possible to reduce suppressor
activity by 75-85%, but with the 30bp operator activity was reduced by 98%.
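
The palindromic character of the two operator sequences recited above (SEQ ID NO: 6 and SEQ ID NO: 5) can be confirmed with a short calculation. The following Python sketch is illustrative only; the function names reverse_complement and is_perfect_palindrome are hypothetical. A sequence is treated as a perfect palindrome when it is identical to its own reverse complement.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_perfect_palindrome(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    op_18 = "ATTGTGAGCGCTCACAAT"              # SEQ ID NO: 6
    op_30 = "TGTGGAATTGTGAGCGCTCACAATTCCACA"  # SEQ ID NO: 5
    print(len(op_18), is_perfect_palindrome(op_18))
    print(len(op_30), is_perfect_palindrome(op_30))
    # The 18 bp operator forms the central core of the 30 bp operator.
    print(op_30.find(op_18))
```

Both sequences equal their own reverse complements, and the 18 bp sequence lies at the center of the 30 bp sequence, flanked by six additional bases on each side.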

The present invention also encompasses a pack or kit comprising two
gene constructs for preparing transgenic mammals for in vivo regulation of gene
expression. The first construct comprises a eukaryotic promoter linked to the
modified repressor gene of the present invention. In one embodiment, the first
construct comprises an intron region linked to the lacI coding sequence of SEQ ID
NO: 2, which is in turn linked to the 3' untranslated region of a eukaryotic gene. The
intron is operably linked to the lacI coding sequence, and thus is properly excised
from the mRNA prior to translation of the mRNA. In one preferred embodiment the
first construct comprises the sequence of SEQ ID NO: 4 operably linked to a
eukaryotic promoter. The second gene construct comprises a eukaryotic promoter that
has been modified to incorporate one or more lac operators. Furthermore, the
modified eukaryotic promoter of the second construct is operably linked to either the
coding sequence of a protein or to a polylinker (i.e., a nucleic acid region containing
multiple restriction endonuclease recognition sites in close proximity). In accordance
with one embodiment the eukaryotic promoters of the two constructs are mammalian
promoters.

The two constructs of the kit can be packaged in a variety of
containers, e.g., vials, tubes, microtiter well plates, bottles, and the like. Other
reagents can be included in separate containers and provided with the kit; e.g.,
positive control samples, negative control samples, buffers, cell culture media, etc.
Preferably, the kits will also include instructions for use.

One aspect of the present invention is directed to non-human
transgenic animals that comprise the nucleic acid constructs of the present invention.
A transgenic plant or animal in accordance with the present invention has at least 1
cell containing a gene construct of the present invention. In preferred embodiments
all the cells of the transgenic plant or animal comprise one or more transgenes of the
present invention inserted into the cell's genome. A transgene is a DNA sequence
integrated at a locus of a genome, wherein the transgenic DNA sequence is not
otherwise normally found at that locus in that genome. Transgenes may be made up
of heterologous DNA sequences (sequences normally found in the genome of other
species) or homologous DNA sequences (sequences derived from the genome of the
same species). The transgenic organisms encompassed by the present invention
include any of the multicellular eukaryotic organisms that undergo sexual
reproduction by union of gamete cells. Preferred organisms include mammals, birds,
fish (e.g., zebrafish), amphibians (e.g., frogs), and plants, including both gymnosperms
and angiosperms. In one preferred embodiment the transgenic animal is a non-human
mammal, including but not limited to sheep, cows, pigs, horses, rabbits, primates and
rodents, such as mice or rats, and the like.

One embodiment of the present invention is directed to transgenic mice
that comprise a nucleic acid sequence comprising a mammalian promoter operably
linked to rabbit β-globin intron 2, which is operably linked to the lacI coding region,
which is linked to the 3' untranslated region of the rabbit β-globin gene. This
construct allows for the expression of lac repressor in amounts sufficient to inhibit
gene constructs that contain one or more copies of the lac operator in the 5' end of the
gene (i.e., near the transcriptional start site of the gene).

In one embodiment a non-human transgenic mammal is provided
wherein the cells of the mammal comprise a repressor transgene that is stably
integrated in its genome. The repressor transgene comprises the nucleic acid sequence
of SEQ ID NO: 4 operably linked to a eukaryotic promoter. In another embodiment a
non-human transgenic mammal is provided wherein the cells of the mammal comprise
an operator (capable of interacting with the repressor encoded by SEQ ID NO: 4)
operably linked to a promoter (or some other gene element), wherein the promoter is
operably linked to a sequence that encodes a protein. In one preferred embodiment
the non-human transgenic mammal's cells comprise both the repressor transgene as
well as a second gene that comprises a eukaryotic promoter, modified to incorporate
one or more lac operators, operably linked to the coding sequence of a protein. Thus
the expression of the second recombinant gene construct can be regulated by
administering lactose or a lactose analog, such as IPTG, to the mammal. Transgenic
animals that comprise both gene constructs can be prepared by crossing the two
respective transgenic mammals to produce a progeny transgenic mammal containing
the transgene of each parent transgenic mammal. The procedure generally involves
mating male and female transgenic mammals (founders) to produce offspring, at least
some of which will be transgenic mammals containing the transgenes of both parents,
i.e., a hybrid transgenic mammal.

The transgenic animals of the present invention can be produced using
methods well known in the art. See for example, Wagner et al., U.S. Pat. No.
4,873,191 (Oct. 10, 1989); Hogan et al., Manipulating the Mouse Embryo: A
Laboratory Manual, Cold Spring Harbor, N.Y. (1987); Capecchi, Science,
244:1288-1292 (1989); and Luskin et al., Neuron 1:635-647 (1988). One technique for
transgenically altering a mammal is to microinject a gene construct into the male
pronucleus of the fertilized mammalian egg to cause one or more copies of the gene
construct to be retained in the cells of the developing mammal. The gene construct is
isolated in a linear form with most of the sequences used for replication in a host cell
removed. Linearization and removal of excess vector sequences results in a greater
efficiency in production of transgenic mammals. See for example, Brinster, et al.,
Proc. Natl. Acad. Sci., USA, 82:4438-4442 (1985). Usually up to 40 percent of the
mammals developing from the injected eggs contain at least 1 copy of the
recombinant DNA in their tissues. These transgenic mammals usually transmit the
gene through the germ line to the next generation. The progeny of the transgenically
manipulated embryos may be tested for the presence of the construct by Southern blot
analysis of a segment of tissue. For example, a small part of the tail of the animal is
used for this purpose. The stable integration of the rDNA into the genome of the
transgenic embryos allows permanent transgenic mammal lines carrying the rDNA to
be established. An exemplary preparation of a transgenic mouse is provided in the
Examples.

Alternative methods for producing a non-human mammal containing
one of the gene constructs of the present invention include infection of fertilized eggs,
embryo-derived stem cells, totipotent embryonal carcinoma (Ec) cells, or early
cleavage embryos with viral expression vectors containing the gene construct. See for
example, Palmiter et al., Ann. Rev. Genet., 20:465-499 (1986) and Capecchi, Science,
244:1288-1292 (1989). The infection of cells within an animal using a replication
incompetent retroviral vector has been described by Luskin et al., Neuron, 1:635-647
(1988). The frequency of obtaining transgenic animals by retroviral infection of
embryos can be as high as that obtained by microinjection of the rDNA and appears to
depend greatly on the titre of virus used. See, for example, van der Putten et al., Proc.
Natl. Acad. Sci., USA, 82:6148-6152 (1985).

Another method of transferring new genetic information into the
mouse embryo involves the introduction of the gene construct into embryonic stem
cells (usually by electroporation) and then introducing the embryonic stem cells into
the embryo. The embryonic stem cells can be derived from normal blastocysts and
these cells have been shown to colonize the germ line regularly and the somatic
tissues when introduced into the embryo. See, for example, Bradley et al., Nature,
309:255-256 (1984).

A transgene containing an operator sequence can be regulated in
accordance with the present invention in a transgenic animal by supplying or
removing the inducing agent. For example, to induce the expression of a transgene
that is suppressed by the operator/repressor system of the present invention, a cell
within the organism that contains the relevant transgene is contacted with an effective
amount of inducer for a time period sufficient for the inducer to be taken up by the
cell and for the inducer to bind the repressor. The repressor dissociates from the
operator, and the gene is expressed within that cell.

An effective amount of inducer is an amount sufficient to bind
repressor and derepress the operator-regulated reporter gene, thereby causing
expression of the reporter gene product in the contacted cell. Preferred amounts of
inducer effective to bind repressor and derepress the regulated gene depend on the degree
and extent of derepression desired. Typically, an effective amount of inducer to be
contacted with a cell to be regulated is in the range of 10 picomolar (pM) to 500
millimolar (mM), preferably about 1 mM to 200 mM, and more preferably about 50
mM. Thus in one embodiment, to induce a transgene in a transgenic animal, an
inducer is administered to the animal in an amount sufficient to produce a blood
concentration having an effective amount of inducer.
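The nested concentration ranges stated above can be written down as a simple lookup. This is a sketch for bookkeeping only; the function name and the three labels are hypothetical, and the boundaries are taken directly from the stated ranges (10 pM to 500 mM, preferably about 1 mM to 200 mM).

```python
# Ranges from the text, expressed in mol/L.
EFFECTIVE_RANGE_M = (10e-12, 500e-3)   # 10 pM to 500 mM
PREFERRED_RANGE_M = (1e-3, 200e-3)     # about 1 mM to 200 mM


def classify_inducer_concentration(molar: float) -> str:
    """Label a concentration (mol/L) against the stated ranges.

    The labels are illustrative; the text gives ranges, not categories.
    """
    low, high = PREFERRED_RANGE_M
    if low <= molar <= high:
        return "preferred"
    low, high = EFFECTIVE_RANGE_M
    if low <= molar <= high:
        return "effective"
    return "outside stated range"


for conc in (1e-9, 10e-3, 50e-3, 1.0):
    print(f"{conc:g} M -> {classify_inducer_concentration(conc)}")
```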

The inducer can be administered to the transgenic animal by a variety
of means to deliver the inducer to the cell (i.e., contact the cell) containing the
eukaryotic gene regulation system to be induced; the choice of means depends in part
on the cell type to be induced and the tissue in which the cell is located in the organism.
Administration can be topical, oral (as by ingestion), intravenous, intramuscular,
intradermal or intraperitoneal, and can be accomplished by a single dose, by repeated
doses, or by continuous infusion. In one embodiment, continuous infusion is obtained
through the use of an implantable osmotic pump. One preferred route of administration
is oral, including, for example, placing the inducer in the animal's food or water.
Repression of the gene product can be reestablished simply by ceasing the
administration of the inducing agent.

The inducer used in the present invention is a molecule, typically a low
molecular weight molecule, that binds to the lac repressor polypeptide of the present
invention and causes the repressor to dissociate from a nucleic acid operator sequence
to which it is bound. More particularly, the lac repressor is induced by a class of
galactoside derivatives that are exemplary of inducers for the present invention. See,
for example, Miller, J. H., in "The Operon", p. 31-88, 34, Miller et al., eds., Cold
Spring Harbor Laboratory, New York, 1980; and Jacob et al., J. Mol. Biol., 3:318-356,
324 (1961).

In one embodiment, lac repressor inducers of the present invention are
derivatives of galactoside that are modified to increase the half-life of the derivative in
physiological solutions. Preferred modified galactosides are thiogalactoside
derivatives such as the prototype isopropyl-beta-D-thiogalactoside (IPTG). Modified
thiogalactosides that are selectively taken up in specific tissues of an animal are
described in US Patent No. 5,589,392, the disclosure of which is incorporated herein.
In particular, by careful selection of a modified thiogalactoside, one can direct the
uptake, and therefore the induction, to specific tissues or cell types based on the
properties of the modified thiogalactoside.

As described in detail in the Examples, the present regulation system
can be used to control the expression of a gene in a transgenic animal. The lac
repressor was demonstrated to effectively regulate pigmentation in the mouse by
controlling the activity of the murine tyrosinase promoter into which lac operators
were inserted to control the expression of a visible marker, tyrosinase. Regulation
was also determined to be fully reversible. In addition to the tyrosinase promoter, the
promoters of the human Huntington's disease gene locus and the murine Arc gene
have also been modified to insert the functional operator sequence of SEQ ID NO: 6.

Expression of the modified genes can be switched on and off easily
during embryogenesis and in the adult mouse by supplementing drinking water with a
low concentration of IPTG. In accordance with one embodiment the drinking water of
the transgenic animal was replaced with 10-12.5 mM IPTG in light-protected water
bottles to induce expression. Expression of reporter genes can also be induced in vivo
by intraperitoneal injection of IPTG. In an exhaustive study on the pharmacokinetics
of potential lac inducers in mammalian cells and animals, IPTG was rapidly taken up
by facilitated transport into the tissues of the animal, where it reached high levels in
cells in 2-4 hours. Nuclear uptake of inducer averaged 18% of the total cell uptake,
estimated to be a 1000-fold higher relative concentration of inducer molecules to
repressor molecules than is required for maximal induction in E. coli. Tissue
distribution in the adult animal was widespread (spleen, liver, lung, kidney, brain, and
adipose tissue), and based on the results described in Example 2, IPTG can cross the
placenta to induce gene expression in embryos. Tissues were found to have a large
capacity for inducer uptake, but the inducer was rapidly cleared from the blood, which
allowed cells to survive the initial high doses that were used to achieve maximal uptake.
The synthetic sugar was not metabolized in the animal and remained functionally active
for at least 4 hours after introduction into the bloodstream. Gene expression can then
be switched off by removing IPTG.

The lac operator-repressor system brings the temporal dimension of
mammalian gene expression in the animal under experimental control that is both
reliable and predictable, and should make it possible to introduce even lethal
mutations into the mouse genome routinely and to study them at the organismal level.
Elimination of the requirement for any viral elements, furthermore, should result in
improved reliability, predictability, and consistency relative to other available
systems. For these reasons, the system should prove to be of general utility for
introducing lethal mutations and creating true phenocopies of genetic diseases in the
laboratory mouse.

The regulatory control of exogenous genes in vivo or in vitro provides
a wide variety of commercial and research applications. Transgenic animals
containing an exogenously-added regulatable gene provide a research tool to
investigate the control of eukaryotic genes, allow the preparation of animals with
altered growth characteristics, allow the development of animal models for human
disease gene therapy, and provide a system to study developmental genes and
tumorigenesis. Inducible expression systems based on prokaryotic elements are
particularly useful because they allow for precise regulation of the exogenous gene
without altering the expression of the other genes present in a cell.

The lac operator-repressor system of the present invention is used in
accordance with one embodiment to regulate both genes that are introduced
experimentally into the resident genome and genes that are already there. In
accordance with one embodiment endogenous genes are targeted for the insertion of
operators to regulate the expression of the targeted endogenous gene in vivo. More
particularly, the present invention encompasses transgenic animals that comprise an
endogenous gene having one or more operators inserted into a gene element of the
gene. The operator sequences can be inserted into the endogenous genes using any of
the standard techniques for introducing gene constructs and inserting the genes into
the genome of the cell. In one embodiment the introduced operator constructs are
flanked with sequences homologous to the endogenous gene and the operator is
inserted into the gene through the use of homologous recombination. The operator
sequences can be inserted at any non-coding site of the gene including the promoter,
introns and 5' and 3' untranslated regions of the gene, with one preferred site being the
intron regions.

In one embodiment a method of regulating the expression of a gene in
a transgenic animal is provided. The method comprises providing a transgenic animal
wherein the cells of said animal comprise a first nucleic acid sequence comprising the
sequence of SEQ ID NO: 4, and a second nucleic acid sequence comprising an
operator operably linked to said gene, and contacting the cells of the transgenic
animal in vivo with an inducer of the repressor. In one embodiment the transgenic
animal is created by first introducing and inserting into the genome of the animal a
DNA construct comprising an operator sequence. In one embodiment the introduced
operator sequence is inserted into an endogenous gene to operably link the operator to
the endogenous gene. In an alternative embodiment the introduced DNA construct
further comprises a gene that is operably linked to the operator and the gene is
inserted into the genome.

In accordance with one embodiment of the present invention an
operator targeting vector is provided that is designed for inserting operators into the
introns of endogenous genes. The vector construct comprises an operator and a
reporter gene construct, wherein the reporter gene construct is flanked by direct
repeats of a site-specific recombinase site. Stating that the reporter construct is
flanked means that the target sites may be directly contiguous with the reporter gene
or there may be one or more intervening sequences present between one or both ends
of the reporter gene and the target sites. The reporter gene construct further comprises
a consensus 3' splice site upstream of the reporter gene. In one embodiment the
operator targeting construct comprises one or more OCR elements, each comprising two
lac operator sequences separated by 150 or 200 bp of spacer nucleotides. In
constructs containing multiple OCR elements, the elements are each separated by 400
bp of spacer nucleotides. The sequence of the spacer nucleotides is not critical,
provided that it gives the desired spacing. The construct can also include sequences
homologous to the target endogenous gene to allow for homologous recombination.
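The spacing rules for OCR elements can be illustrated by computing operator coordinates along a construct. The sketch below rests on stated assumptions: the function name is hypothetical, spacers are measured edge to edge (the text does not say whether spacing is edge to edge or center to center), and the default operator length of 29 bp is borrowed from the primary operator of SEQ ID NO: 19 described in the Examples.

```python
from typing import List, Tuple


def ocr_operator_coordinates(n_elements: int,
                             operator_len: int = 29,
                             intra_spacer: int = 150,
                             inter_spacer: int = 400) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) coordinates of every operator.

    Each OCR element holds two operators separated by `intra_spacer` bp;
    consecutive OCR elements are separated by `inter_spacer` bp.
    Assumption: all spacers are counted edge to edge.
    """
    coords = []
    pos = 0
    for element in range(n_elements):
        if element > 0:
            pos += inter_spacer          # spacer between OCR elements
        for operator in range(2):
            if operator > 0:
                pos += intra_spacer      # spacer inside one OCR element
            coords.append((pos, pos + operator_len))
            pos += operator_len
    return coords


layout = ocr_operator_coordinates(2)
for start, end in layout:
    print(f"operator at {start}..{end}")
print("total span:", layout[-1][1], "bp")
```

Passing `intra_spacer=200` models the alternative 200 bp spacing mentioned in the text.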

The site-specific recombinase sites and the corresponding site-specific
recombinase used in the present invention may include any enzyme system wherein
the enzyme is capable of being functionally expressed in eukaryotic cells, and
catalyzes conservative site-specific recombination between its corresponding target
sites. For reviews of site-specific recombinases, see Sauer (1994) Current Opinion in
Biotechnology 5:521-527; Sadowski (1993) FASEB 7:760-767; the contents of which
are incorporated herein by reference. Methods of using site-specific recombination
systems to excise DNA fragments from chromosomal or extrachromosomal plant
DNA are known to those skilled in the art. The bacteriophage P1 loxP-Cre and the
Saccharomyces plasmid FRT/FLP site-specific recombination systems have been
extensively studied. For example, Russell et al. (1992, Mol. Gen. Genet. 234:49-59)
describe the excision of selectable markers from tobacco and Arabidopsis genomes
using the loxP-Cre site-specific recombination system.

The reporter gene can be any gene sequence that encodes a detectable
marker. Preferred markers include selectable markers (such as antibiotic resistance
genes) and fluorescent markers. A 3' acceptor splice sequence is provided upstream
of the reporter gene. Consensus splice sequences are well known to those skilled in the
art and include those described in US Patent No: 5,744,326, the disclosure of which is
incorporated herein. In one embodiment the 3' acceptor splice site comprises a series
of pyrimidines followed by AG.
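The pattern "a series of pyrimidines followed by AG" can be expressed as a regular expression for scanning a candidate sequence. This is a rough sketch, not a splice-site predictor: the function name is hypothetical, and the minimum tract length of 8 pyrimidines is an assumed threshold that the text does not specify.

```python
import re
from typing import List, Tuple


def find_acceptor_like_sites(seq: str, min_tract: int = 8) -> List[Tuple[int, int]]:
    """Return (start, end) spans of pyrimidine runs followed by AG.

    `min_tract` is an illustrative assumption; the text says only
    "a series of pyrimidines followed by AG".
    """
    pattern = re.compile(r"[CT]{%d,}AG" % min_tract)
    return [match.span() for match in pattern.finditer(seq.upper())]


# Made-up test sequence: a 13-nt pyrimidine run followed by AG.
example = "acgttttcctttctctaggca"
print(find_acceptor_like_sites(example))
```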

In one embodiment the targeting construct comprises a lac OCR
element (two lac operators spaced 150 bp apart), followed by a 400 bp spacer
sequence comprising the rabbit β-globin second intron, followed by another lac OCR
element (See Fig. 3). Immediately 3' (downstream) of the lacO elements is a
loxP-flanked cassette consisting of a 3' splice site and an internal ribosome entry site
(IRES, for translation initiation) linked to a GFPneo fusion sequence with its own poly(A)
addition sequence (See Fig. 3). This insertion vector is designed for random
mutagenesis of endogenous genes. Multiple lacO binding sites ensure that
operator-bound lac repressor will be able to block transcription elongation, while the
3' splice site is designed for trapping the construct within an intron (splicing should occur
between the 5' splice site of the intron and the 3' splice site provided by the construct).
Thus only when the construct is inserted into an intron will the marker gene undergo
post-transcriptional modification and become operably linked to the coding region of
the endogenous gene.

In one embodiment the marker gene comprises a GFPneo cassette.
This cassette includes a GFP reporter that is sensitive to incorporation into an active
transcription unit, and is also a selectable marker for positive selection of transfected
ES clones with G418. The reporter also allows characterization of randomly targeted
ES cell clones for their ability to be regulated by lac repressor. In the presence of lac
repressor, expression of the GFPneo cassette will be suppressed, while in the presence
of IPTG, removal of the lac repressor-mediated block of transcription elongation
should result in GFPneo expression. The loxP sites allow Cre-mediated excision of
the reporter sequences. This is necessary so that expression from the lacO-targeted
gene in the absence of lac repressor is not truncated at the polyA site associated with
the GFPneo cassette.

The operator targeting construct of the present invention can be
formulated as part of a kit that is used to produce transgenic organisms. In one
embodiment the kit comprises two gene constructs for preparing transgenic mammals
for in vivo regulation of gene expression. The first construct comprises a eukaryotic
promoter linked to the modified repressor gene of the present invention and the
second construct comprises the operator targeting construct of the present invention.
In particular, the operator targeting construct comprises an operator sequence operably
linked to a reporter gene construct, wherein the reporter gene construct comprises a 3'
splice acceptor sequence and a reporter gene, and the reporter gene construct is
flanked at either end by direct repeats of a site specific recombinase target sequence.
In one preferred embodiment the operator targeting construct has the structure shown
in Fig. 3, wherein the operator sequence is selected from the group consisting of SEQ
ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 19 or SEQ ID NO: 20.

In accordance with the present invention a transgenic animal can be
prepared using the operator targeting construct of the present invention. The method
comprises the steps of introducing the targeting construct into the cell of the plant or
animal using standard transgenic techniques, identifying those plants or animals that
are expressing the reporter gene, and introducing site-specific recombinase activity to
remove the reporter gene cassette. In one embodiment the site-specific recombinase
activity is introduced by inducing the expression of a recombinase gene already
present in the animal or plant. Alternatively the recombinase activity can be
introduced into the progeny of the plant or animal by crossing the original transgenic
(or its progeny) with a transgenic line that constitutively expresses recombinase activity
in its cells. The resultant transgenic organisms, comprising a targeted insertion of lacO
elements within an intron, should be capable of conferring multiple rounds of both
gene repair and inactivation under the control of the lac repressor.

A transgenic animal comprising a gene operably linked to an operator
and a repressor gene construct (a "double transgenic") is then created by introducing a
repressor encoding nucleic acid sequence into an animal (or its progeny) that
comprises a gene operably linked to an operator. In one embodiment the step of
creating the double transgenic animal comprises mating a transgenic animal
comprising the operator-containing endogenous gene with a transgenic animal that
comprises a repressor gene construct of the present invention. In one embodiment the
repressor gene construct comprises a eukaryotic promoter operably linked to the
sequence of SEQ ID NO: 4.

The creation of conditional alleles and transgenes that can be switched
on and off reversibly will enable studies relating to the reversibility of the disease
processes. Such models could become important tools for evaluating the
consequences of silencing or reactivating the expression of normal or mutant genes on
disease progression, and for determining the efficacy of potential therapeutic
strategies that can be applied even after overt symptoms of a disease have developed.

Example 1

Target gene activity is controlled by LacI and IPTG in the mouse

The lac operator-repressor system of the present invention was tested
in mice using a regulatable version of a well-characterized visible marker gene,
tyrosinase. Tyrosinase is the protein product of the albino (c) locus (Kwon et al.,
PNAS 84, 7473-7477 (1987)), and is the enzyme that catalyzes the first step in
melanin biosynthesis. The target transgene consists of the wildtype murine tyrosinase
cDNA under the control of the murine tyrosinase promoter modified to contain lac
operator sequences (See Fig. 2). The major transcription start site in the tyrosinase
promoter is 83 bp upstream of the start codon. To maintain the endogenous spacing
of promoter elements in the critical region between the start of transcription and the
start of translation, PCR-based, site-directed mutagenesis was used to change 25 bp
of the endogenous sequence to create a primary lac operator centered at 59 bp
upstream of the start of translation. Additional operators were inserted 176 bp and
526 bp upstream of the primary operator (Fig. 2). Mice containing this modified
Tyrosinase transgene resemble pigmented animals previously described (Methot et al.,
Nucleic Acids Research, 23, 4551-4556 (1995)) that had been microinjected with an
unregulatable version of the same transgene. Two lines of pigmented Tyrosinase
transgenic mice were established containing the regulatable transgene. The Tyr^lacO
(25) line displays a himalayan pigmentation pattern, and the Tyr^lacO (43) line displays
a light pigmentation pattern, similar to those described in Methot et al. (1995). Mice
transgenic for the Tyrosinase transgene were crossed to mice transgenic for LacI.

In double transgenics, the lac repressor should bind to the operator
sequences located in the tyrosinase promoter, block transcription of tyrosinase, and
revert pigmented animals to albino. This was in fact observed in the double
transgenic mice. The coat of the double transgenic is unpigmented and
indistinguishable from that of a nontransgenic albino. Treatment of a double
transgenic animal with 10 mM IPTG in the drinking water derepressed tyrosinase
expression, resulting in a phenotype indistinguishable from that of the mouse
transgenic for the Tyr^lacO construct alone. The stringency of repression and derepression
was evident from observation of the pigmentation of the eye. Comparison of sections
through the eyes of a nontransgenic albino, a Tyr transgenic, a Tyr, LacI double
transgenic, and a Tyr, LacI double transgenic mouse treated with IPTG revealed that
repression of the target tyrosinase transgene expression is accompanied by an absence
of melanin in the retinal pigment epithelium (RPE) of the double-transgenic animal.
The entire RPE is devoid of melanin, as can be seen by the completely unpigmented
appearance of the eye in whole mount. Derepression by IPTG is accompanied by a
restoration of pigmentation in the RPE to levels indistinguishable from the
nonrepressed state.

Similar results were obtained with both lines of regulatable Tyrosinase
mice. This indicates that regulation is neither insertion site specific nor simply
fortuitous, but rather controlled by the lac repressor acting specifically on a target
gene with lac operator sequences in its promoter. The albino mutation is a single base
pair change in the coding sequence of tyrosinase, which causes a single amino acid
change in the protein. Because the mutant allele is both transcribed and translated,
promoter activity has not been assayed quantitatively at the molecular level.
Nevertheless, one can infer from its effect on pigmentation that the tyrosinase
promoter is in fact regulated by the lac operator-repressor system tightly, in a
biologically relevant manner.

These results also show that IPTG can be introduced into the drinking
water and circulate in the mouse at a level sufficient to derepress target gene
expression. This level appears to be completely nontoxic. Tyr^lacO, LacI
double-transgenic mice have been administered 10 mM IPTG in their drinking water for
up to 8 months with no deleterious effects.



Materials and methods 

Construction of lac repressor genes

The lac repressor constructs W, S, 5'C1, 5'C2, 5'C4, 3'C1, 3'C2, 3'C3,
and 3'C4 are driven by a 4.3-kb promoter region from the human β-actin gene from
the EcoRI site up to the AluI site at -7 (Leavitt et al. 1984), followed by a short linker
of either 44 bp (gatcagtcga cctgcagccc aagcttgata tcgaattcgg atct; SEQ ID NO: 13) in
W, 5'C1, 5'C4, 3'C1, 3'C2, 3'C3, and 3'C4 or 30 bp (gatcagtcga cctgcagccc aagcttcacc;
SEQ ID NO: 14) in S and 5'C2. 5'C3 contains the 4.3-kb promoter up to the start of
translation with no polylinker. All of the above constructs contain the
polyadenylation signal sequence from the bovine growth hormone gene (Woychik et
al. 1982) connected to the 3' end of the construct by a BamHI and EcoRI linker region
(taggatcccc gggctgcagg aattc; SEQ ID NO: 15).

Coding regions for the original wtlacI (W) and synlacI (S) constructs
are as previously described (Scrable and Stambrook 1997). 5'C1 and 5'C2 were made
by switching the linker region and the first 36 bp of the coding region between wtlacI
and synlacI using the BsrFI site shared by both constructs. 5'C1 contains the wtlacI
linker and first 36 bp of the coding region, and then the synlacI coding region. The
nuclear localization signal sequence (NLS) that had been attached to the synlacI
coding sequence was removed by PCR mutagenesis, so that 5'C1 codes for a protein
identical in amino-acid sequence to the endogenous lac repressor. 5'C2 contains the
synlacI linker and first 36 bp of the coding region, and then the 3' wtlacI coding region
through the stop site. 5'C3 is identical to the endogenous β-actin promoter up to the
ATG start site, then contains the original synlacI coding region. This was created by
PCR mutagenesis to remove the linker region present in S and replace the 6 bp
missing between the AluI site and +1. 5'C4 contains the linker region from W and the
entire synlacI coding region, with no NLS. This was created by PCR mutagenesis of
5'C1 to return the four bases in the beginning of the coding region that differ between
W and S back to the synlacI sequence. 3'C1 contains the wtlacI sequence from the
start of translation up to the EcoRV site at +800 (which W and S have in common)
and the synlacI sequence after the EcoRV site. The SV40 NLS is attached to the 3'
end of the 3'C1 coding region with the linker region (agcagcctgaggcct; SEQ ID NO:
16), as described (Fieck et al. Nucleic Acids Res. 20, 1785-1791 (1992)), and was
created by PCR mutagenesis of the existing NLS linker region described in Scrable
and Stambrook (1997). 3'C2 is identical to 5'C1 up to the EcoRV site, then identical
to W downstream. 3'C3 is identical to 5'C1 up to the PvuII site at +950 from the start
of translation, then identical to W downstream. 3'C4 is identical to W upstream of the
PvuII site, then identical to 3'C1 downstream.

Constructs M and R contain the human β-actin promoter blunted at the
AscI site at 70 followed by the rabbit β-globin intron 2 from the blunted NcoI site
through the EcoRI site in exon 3. The lacI coding region is inserted at the EcoRI site.
The M coding region is identical to W, and the R coding region is identical to 3'C4.
The rabbit β-globin fragment continues downstream of the lacI coding region to
include the rest of exon 3 and the β-globin 3' untranslated region with a
polyadenylation signal sequence. The polyA signal sequence from SV40 also is
present at the 3' end. All of the β-globin sequences and the SV40 polyA signal
sequence are as described (Katsuki et al. Science 271, 1247-1254 (1988)).

RT-PCR assay for splice site use in lac repressor transcripts 

Total RNA was extracted from testis of W and S transgenic animals, or
Rat2 fibroblasts transfected by calcium phosphate with the indicated lacI construct,
using TRI Reagent (Molecular Research Products, Inc.). RNA was DNase treated
with RQ1 DNase (Promega) and 1 µg reverse transcribed with AMV-RT. cDNA was
subjected to 30 rounds of amplification (95°C for 30 sec, 60°C for 30 sec, 72°C for 1
min) using a primer in the β-actin promoter (acagagcctcgcctttg; SEQ ID NO: 17) and
a primer in the lacI coding sequence (tgcaggcagcttccaca; SEQ ID NO: 18). PCR
products were run on a 4% polyacrylamide gel in 1X TBE and transferred to Hybond-N+
membrane (Amersham) by semi-dry electrophoresis in NAQ transfer solution
(0.08 M Tris-HCl, 0.118 M Borate, 2.4 mM EDTA, pH 8.3) at 220 mA for 1 h. The
resultant Southern blot was UV crosslinked and then prehybridized and hybridized
according to the methods described in Scrable and Stambrook (1997).
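As a quick sanity check on the amplification conditions above, the primer lengths, their G/C content, and the total programmed hold time can be computed directly from the values in the text. This is a small sketch with hypothetical helper names; the hold time excludes temperature ramping and any initial denaturation or final extension, which the text does not describe.

```python
# Primer sequences as given in the text (SEQ ID NO: 17 and SEQ ID NO: 18).
PRIMERS = {
    "beta_actin_promoter": "acagagcctcgcctttg",
    "lacI_coding": "tgcaggcagcttccaca",
}

# Cycling program from the text: (temperature in deg C, hold time in seconds).
CYCLE_STEPS = [(95, 30), (60, 30), (72, 60)]
N_CYCLES = 30


def gc_count(seq: str) -> int:
    """Count G and C bases in a nucleotide sequence (case-insensitive)."""
    lowered = seq.lower()
    return lowered.count("g") + lowered.count("c")


def total_hold_minutes(steps, n_cycles: int) -> float:
    """Sum of programmed hold times over all cycles, in minutes."""
    return n_cycles * sum(seconds for _, seconds in steps) / 60


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {gc_count(seq)} G/C bases")
print("programmed hold time:", total_hold_minutes(CYCLE_STEPS, N_CYCLES), "min")
```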


Rat2 transfection assay for lac repressor function 

Rat2 fibroblasts were transfected with 2.5 µg pSVOZ DNA (pSVOZ is
a construct comprising the SV40 early promoter that contains a single, symmetrical
operator driving the expression of the β-galactosidase (lacZ) reporter gene, which
contains the endogenous O2 operator) and 2.5 µg of the indicated lac repressor
construct DNA (or pBSSK carrier DNA) per 3 x 10^5 cells by standard calcium
phosphate-mediated transfection. Growth medium was DMEM, 0.1 units/mL penicillin/
0.1 µg/mL streptomycin (Life Technologies), 5% FCS (Hyclone) (with 20 mM IPTG,
if indicated). Two days after transfection, cells were fixed in 0.5% glutaraldehyde,
incubated with X-gal containing solution (0.5 mg X-gal in dimethylformamide, 44 mM
HEPES, 3 mM potassium ferrocyanide, 3 mM potassium ferricyanide, 1.5 mM NaCl,
0.13 mM MgCl2 at pH > 7) at 37°C overnight, and the number of blue cells in each
well recorded.



Nucleic acid extraction and blotting 
Northern blots were prepared as described in Scrable and Stambrook
(1997), except that RNA was transferred to Biodyne A nylon membrane. DNA was
extracted from tail biopsies using the simplified method as described in Laird et al.
Nucleic Acids Res. 19, 4293 (1991). Southern blots were prepared and analyzed
according to the methods given in Scrable and Stambrook (1997).


Detection of lac repressor protein by Western blot and immunohistochemistry 

A panel of monoclonal antibodies to the lac repressor was created by
injecting a lacI-TrpE fusion protein into mice. For Western blots, total protein was
extracted into lysis solution (50 mM Tris at pH 7.5, 0.15 M NaCl, 1% Nonidet P40),
containing protease inhibitors (0.25% sodium deoxycholate, 1 mM PMSF, 2 mM
EGTA, 1 µM leupeptin, 0.2 µM aprotinin, 0.8 mM N-ethylmaleimide, 2 µM pepstatin
A). Protein concentration was determined by Lowry's assay, and 30 ng run on a 12%
SDS-PAGE gel. The proteins were transferred to nitrocellulose membrane with
semi-dry electrophoresis, and blocked in 5% dried milk in PBS overnight. The blot was
incubated with biotinylated anti-lacI antibody 5F8 (25 µg/mL in 1% BSA/TBST) for 1
h at 37°C, labeled with peroxidase (ABC reagent, Vector) and visualized with
chemiluminescence (SuperSignal, Pierce) on a ChemiImager (Alpha Innotech Corp.).

For immunohistochemistry, mice were given a lethal dose of Nembutal
sodium, and perfused with 4% paraformaldehyde for 30 min. Tissues were placed in
20% sucrose overnight at 4°C, frozen, sectioned at 30 µm, and thaw-mounted onto
Superfrost Plus (Fisher) slides. Sections were incubated with biotinylated anti-lacI
antibody 9A5 (3 µg/mL in 1% BSA/0.3% Triton X-100 in PBS) overnight at 4°C,
labeled with peroxidase (ABC reagent, Vector), and visualized with DAB.






Construction of the regulatable Tyrosinase transgene (Tyr^lacO)

The regulatable Tyr^lacO transgene is based on the construct TYBS
described in Yokoyama et al. Nucleic Acids Res. 18, 7293-7298 (1990). The first lac
operator was created by site-directed mutagenesis (ExSite, Stratagene). 25 bp of the
endogenous promoter sequence (from 72 to 48) was changed to make a 29 bp operator
centered at 59, identical in sequence to the primary operator of the lac operon
(gtggaattgt gagcggataa caatttcac; SEQ ID NO: 19) (Lewis et al. 1996). Two additional
operators with the same sequence were inserted as part of a 47 bp fragment (agatctgtgg
aattgtgagc ggataacaat ttcacggatc cagatct; SEQ ID NO: 20) into the BsrGI site at 203
and the EcoRV site at 548 of the promoter.
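The stated lengths of the two operator sequences, and the claim that the 47 bp fragment carries an operator of the same sequence, can be checked directly from the sequences as printed. The sketch below does only that; the variable names are illustrative. The comment on the flanking agatct hexamers (the BglII recognition sequence) is an observation about the printed sequence, not a statement taken from the text.

```python
# Sequences as printed in the text, with the grouping spaces removed.
PRIMARY_OPERATOR = "gtggaattgt gagcggataa caatttcac".replace(" ", "")   # SEQ ID NO: 19
OPERATOR_FRAGMENT = (
    "agatctgtgg aattgtgagc ggataacaat ttcacggatc cagatct".replace(" ", "")
)                                                                      # SEQ ID NO: 20

# Length checks against the values stated in the text (29 bp and 47 bp).
print("operator length:", len(PRIMARY_OPERATOR))
print("fragment length:", len(OPERATOR_FRAGMENT))

# The fragment should contain the operator sequence exactly once.
offset = OPERATOR_FRAGMENT.find(PRIMARY_OPERATOR)
print("operator starts at fragment offset:", offset)
print("occurrences:", OPERATOR_FRAGMENT.count(PRIMARY_OPERATOR))

# Observation: the fragment begins and ends with agatct, the BglII
# recognition sequence. The text itself does not name this site.
print("flanked by agatct:",
      OPERATOR_FRAGMENT.startswith("agatct") and OPERATOR_FRAGMENT.endswith("agatct"))
```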

Production of transgenic mice 

Production of the W and S lines is described in Scrable and Stambrook
(1997). The rest of the transgenic lines described were produced by microinjection
into the outbred ICR line (Harlan) using standard procedures. Two transgenic
founders were made for the 3'C4 transgene; both showed a testis-only expression
pattern. One founder line was established for the M construct. Three founders were
transgenic for R; two (lines 1 and 3) exhibited ubiquitous expression, and one (line 13)
had more limited expression that ranged from low to moderate in various tissues.
Eight founders were transgenic for Tyr^lacO; an F1 generation was produced from all
eight, and two of those established pigmented transgenic lines (lines 25 and 43). Of
the animals indicated as Tyrosinase transgenic, two were homozygous for Tyr^lacO, and
all others were hemizygous for Tyr^lacO. All lacI transgenic mice described were
hemizygous for lacI.

Analysis of eye pigmentation and IPTG treatment of mice 

For adult eyes, mice were given a lethal dose of Nembutal sodium,
perfused transcardially (1.25% paraformaldehyde, 1.5% glutaraldehyde, in 0.1 M
phosphate at pH 7.4); eyes were dissected out and photographed. They then were
embedded in paraffin, sectioned at 10 µm, dewaxed in xylene, hydrated in decreasing
concentrations of ethanol, and reacted in cresyl violet (0.5% in 20% ethanol, pH to 2.5
with glacial acetic acid) for 8 min, dehydrated, cleared, and mounted in DPX. For
embryonic eyes, pregnant females were euthanized at E12.5, and the embryos
removed. The head of each embryo was fixed in 2% paraformaldehyde in 0.1 M
phosphate (pH 7.4), and the lower half was taken for genotyping.
Tyrosinase activity was derepressed by replacing the drinking water
with a 10 mM solution of IPTG (changed every four days). To allow for the normal
pattern of tyrosinase expression during development, the female was started on 10 mM
IPTG at day 7 of pregnancy.
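
As a rough guide to preparing the 10 mM drinking-water solution, the mass of IPTG
needed for a given volume follows from its molar mass. The sketch below is only an
illustration: it assumes a molar mass of about 238.3 g/mol for IPTG (a value not
stated in this document), and the function name and the 250 mL bottle volume are
hypothetical.

```python
# Assumption: molar mass of IPTG (isopropyl beta-D-1-thiogalactopyranoside),
# approximately 238.3 g/mol. This value is not given in the text.
IPTG_MOLAR_MASS_G_PER_MOL = 238.3


def iptg_mass_grams(volume_ml: float, concentration_mm: float = 10.0) -> float:
    """Return grams of IPTG needed for volume_ml of a concentration_mm (mM) solution."""
    moles = (concentration_mm / 1000.0) * (volume_ml / 1000.0)
    return moles * IPTG_MOLAR_MASS_G_PER_MOL


if __name__ == "__main__":
    # Hypothetical example: one 250 mL water bottle at 10 mM.
    print(f"{iptg_mass_grams(250):.3f} g IPTG per 250 mL")
```

Because the solution is changed every four days, the same calculation would be
repeated for each fresh bottle.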

Example 2 

Regulation is functional during embryogenesis, and reversible 

To determine if lac elements could regulate pigmentation during
embryogenesis and if IPTG could act transplacentally, the pigmentation in the
developing retinal pigment epithelium of embryonic and newborn mice was
investigated. At E9, tyrosinase activity in the embryonic eye of wild type mice begins
to deposit pigment in the developing retinal pigment epithelium. At E12.5, the mouse
RPE clearly is pigmented. Tyr^lacO (43) transgenic mice recapitulate this developmental
event. A distinct band of pigmentation surrounding the central lens is seen in the
Tyrosinase transgenic embryo that is not seen in the nontransgenic albino. The lac
repressor blocked pigmentation during embryogenesis in the double-transgenic
embryo, but not when the mother was treated with IPTG during pregnancy.

These results clearly demonstrate not only that lac regulatory sequences
function well during embryogenesis, but also that IPTG can cross the placenta to alter
the phenotype of developing animals. Finally, the reversibility of the system was tested
by switching the Tyrosinase transgene on after it had been off, or off after it had been
on, in the same animal. The phenotypes of eyes of newborn mice were compared to
the phenotypes of embryonic eyes of mice of the same genotype. When the mother of
the IPTG-treated Tyr^lacO, LacI double transgenic was not started on IPTG in her
drinking water until E12.5 of the pregnancy, fully pigmented eyes were seen in the
progeny at birth. However, at E12.5, this double transgenic pup has the albino
phenotype. The fully pigmented eyes seen at birth in this animal demonstrate that even
after a period of silencing, derepression by IPTG was able to switch tyrosinase
expression on. Reversibility was also observed with regard to switching off the
tyrosinase gene after it had initially been on. When a Tyr^lacO, LacI double-transgenic
pup's mother was taken off IPTG at P9, removal of IPTG caused reversion of the coat
phenotype to albino. As expected, the eyes remain pigmented due to the low turnover
of cells and melanosomes in the RPE.

In summary, by modifying both the target promoter and the gene
encoding the lac repressor, a regulatory system used in bacteria was successfully
adapted to control the transcription of a gene so that it can function analogously in the
complex environment of the mouse. The LacI mouse described in this report expresses
the lac repressor ubiquitously, so it can be used in the future to regulate other
promoters with the same degree of control demonstrated for the Tyrosinase transgene.
In addition, it is anticipated that operator elements can be inserted into
endogenous promoters through homologous recombination.
This would move the system to the next level, where endogenous loci can be switched
on and off repeatedly to create reversible models of human disease and normal
development in the mouse.

Materials and methods

Preparation of primary mouse embryo cells 

Pregnant females were euthanized on day E13.5 (where E0.5 was the
day a vaginal plug was observed). The embryos were dissected out and a small section
frozen for genotyping. Embryonic tissue was minced and placed in 2 mL of dissociation
solution [2 mg/mL Collagenase B, 2 U/mL RQ1 DNase in RPMI 1640 media
(GIBCO)] at 37°C for 2 h, triturating the solution after 1 h. Cells were spun at 175 g,
washed one time with Hank's BSS, plated with growth media [DMEM, 0.1 units/mL
penicillin/0.1 µg/mL streptomycin (GIBCO), 10% FCS (Hyclone)], and transfected by
calcium phosphate.

Analysis of eye pigmentation and IPTG treatment of mice

For embryonic eyes, pregnant females were euthanized at E12.5, and the
embryos removed. The head of each embryo was fixed in 2% paraformaldehyde in 0.1
M phosphate (pH 7.4), and the lower half was taken for genotyping. Tyrosinase
activity was derepressed by replacing the drinking water with a 10 mM solution of
IPTG (changed every four days). To allow for the normal pattern of tyrosinase
expression during development, the female was started on 10 mM IPTG at day 7 of
pregnancy.



WO 02/086098 PCT/US02/06468 

Claims 

1. A nucleic acid sequence comprising the sequence of SEQ ID NO: 1. 

2. The nucleic acid sequence of claim 1 further comprising a nuclear
localization signal.

3. The nucleic acid sequence of claim 2 wherein the sequence comprises the 
sequence of SEQ ID NO: 2 or SEQ ID NO: 4. 

4. The nucleic acid sequence of claim 2 further comprising
a mammalian promoter that comprises a transcriptional start site; and
an intron, wherein said promoter is operably linked to said intron and said
intron is operably linked to the 5' end of sequence of SEQ ID NO: 2, wherein said
intron provides adequate spacing so that two 100 bp regions located approximately 600
and 800 bp downstream of the transcription start site are devoid of CpG dinucleotides.

5. The nucleic acid sequence of claim 3 wherein said sequence comprises the 
sequence of SEQ ID NO: 4. 

6. The nucleic acid sequence of claim 5 further comprising a promoter
operably linked to the sequence of SEQ ID NO: 4.

7. The nucleic acid construct of claim 6 formed as a plasmid. 

8. A host cell comprising the nucleic acid construct of claim 2.

9. The host cell of claim 8 wherein the cell is a eukaryotic cell. 



10. The cell of claim 9 wherein said construct is inserted into the genome of
the cell.




11. A non-human transgenic mammal comprising an exogenous DNA
molecule that is stably integrated in its genome, wherein said exogenous DNA
molecule comprises the nucleic acid sequence of claim 4.

12. The transgenic mammal of claim 11 wherein the exogenous DNA molecule
comprises a mammalian promoter operably linked to the nucleic acid sequence of SEQ
ID NO: 4.

13. The transgenic animal of claim 12 further comprising a nucleic acid
sequence that comprises an operator operably linked to a gene.

14. A kit for regulating the expression of a gene, said kit comprising
a first nucleic acid sequence comprising the sequence of claim 4; and
a second nucleic acid sequence comprising an operator operably linked to a
promoter.

15. The kit of claim 14 wherein the first nucleic acid sequence comprises a
promoter operably linked to the sequence of SEQ ID NO: 4; and the second nucleic
acid sequence further comprises a polylinker operably linked to the promoter of said
second nucleic acid sequence.

16. The kit of claim 15 wherein said first and second nucleic acid sequences are
formed as plasmids.

17. A method of regulating the expression of a gene in a transgenic animal,
said method comprising the steps of
providing a transgenic animal wherein the cells of said animal comprise a first
nucleic acid sequence comprising the sequence of claim 4, and a second nucleic acid
sequence comprising an operator operably linked to said gene;
contacting the cells of said animal in vivo with an inducer of said repressor.

18. The method of claim 17 wherein said first nucleic acid sequence comprises
a mammalian promoter operably linked to the sequence of SEQ ID NO: 4.

19. The method of claim 18 wherein the step of contacting the cells
comprises administering said inducer orally or intraperitoneally.

20. A method of regulating the expression of an endogenous gene in vivo,
said method comprising the steps of
providing a transgenic animal that comprises an operator sequence inserted into
an endogenous gene; and
introducing a repressor encoding nucleic acid sequence, comprising a
eukaryotic promoter operably linked to the sequence of SEQ ID NO: 4, into that
transgenic animal or its progeny.

21. The method of claim 20, wherein the operator sequence is inserted into an
endogenous gene by homologous recombination after a nucleic acid sequence
comprising the operator sequence is introduced into a cell of said animal.

22. The method of claim 21 wherein the step of introducing the repressor
encoding nucleic acid sequence comprises mating a transgenic animal comprising the
operator containing endogenous gene with a transgenic animal that comprises the
nucleic acid sequence comprising a eukaryotic promoter operably linked to the
sequence of SEQ ID NO: 4.

23. An operator targeting construct comprising
an operator sequence;
two direct repeats of a site specific recombinase target sequence; and
a reporter gene construct, said reporter gene construct comprising a reporter
gene and a 3' splice acceptor site located upstream from said reporter gene, wherein the
operator is operably linked to the reporter gene construct and said reporter gene
construct is flanked by direct repeats of the site-specific recombinase target sequence.

24. The construct of claim 23 wherein the operator sequence is selected from
the group consisting of SEQ ID NO: 5 or SEQ ID NO: 6;
the site specific recombinase target sequence is a loxP site or an FRT site.

25. The construct of claim 24 wherein the operator targeting construct further
comprises a second operator separated from the first operator by about 150 to about
200 base pairs of DNA.
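
The CpG condition recited in claim 4 (two 100 bp regions, approximately 600 and
800 bp downstream of the transcription start site, devoid of CpG dinucleotides) can
be checked computationally for any candidate construct. The following is a minimal
sketch with hypothetical function names; the sequence is assumed to be indexed from
the transcription start site, and the toy sequence is illustrative only and is not
one of the sequences of the invention.

```python
from typing import List


def cpg_positions(seq: str) -> List[int]:
    """Return 0-based start positions of every CpG dinucleotide in seq."""
    s = seq.lower()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "cg"]


def is_cpg_free(seq: str, start: int, length: int = 100) -> bool:
    """True if the window seq[start:start+length] contains no CpG dinucleotide.

    A CpG that straddles the window boundary is not counted.
    """
    return "cg" not in seq[start:start + length].lower()


if __name__ == "__main__":
    # Toy sequence: 100 bp without CpG, one CpG at position 101, then 100 bp without CpG.
    toy = "at" * 50 + "acgt" + "at" * 50
    print(cpg_positions(toy))
    print(is_cpg_free(toy, 0), is_cpg_free(toy, 50))
```

For a real construct, the windows would be evaluated at offsets near 600 and 800
from the transcription start site, with some tolerance for the word "approximately".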
SEQUENCE LISTING

<110> The University of Virginia Patent Foundation
      Scrable, Heidi
      Cronin, Carolyn

<120> A Lac Operator-Repressor System

<130> 00663-02

<150> 60/281,322
<151> 2001-04-04

<150> 60/273,480
<151> 2001-03-05

<160> 20

<170> PatentIn version 3.0

<210> 1
<211> 1080
<212> DNA
<213> synthetic construct

<400> 1
atgaaaccag taacgttata cgatgtcgca gagtatgccg gtgtctctta tcagactgtt 60
tccagagtgg tgaaccaggc cagccatgtt tctgccaaaa ccagggaaaa agtggaagca 120
gccatggcag agctgaatta cattcccaac agagtggcac aacaactggc aggcaaacag 180
agcttgctga ttggagttgc cacctccagt ctggccctgc atgcaccatc tcaaattgtg 240
gcagccatta aatctagagc tgatcaactg ggagcctctg tggtggtgtc aatggtagaa 300
agaagtggag ttgaagcctg taaagctgca gtgcacaatc ttctggcaca aagagtcagt 360
gggctgatca ttaactatcc actggatgac caggatgcca ttgctgtgga agctgcctgc 420
actaatgttc cagcactctt tcttgatgtc tctgaccaga cacccatcaa cagtattatt 480
ttctcccatg aagatggtac aagactgggt gtggagcatc tggttgcatt gggacaccag 540
caaattgcac tgcttgcggg cccactcagt tctgtctcag caaggctgag actggccggc 600
tggcataaat atctcactag gaatcaaatt cagccaatag ctgaaagaga aggtgactgg 660
agtgccatgt ctgggtttca acaaaccatg caaatgctga atgagggcat tgttcccact 720
gcaatgctgg ttgccaatga tcagatggca ctgggtgcaa tgagagccat tactgagtct 780
gggctgagag ttggtgcaga tatctcggta gtgggatacg acgataccga agacagctca 840
tgttatatcc cgccgtcaac caccatcaaa caggattttc gcctgctggg gcaaaccagc 900
gtggaccgct tgctgcaact ctctcagggc caggcggtga agggcaatca gctgttgcca 960
gtctcactgg tgaagagaaa aaccaccctg gcacccaata cacaaactgc ctctccccgg 1020
gcattggctg attcactcat gcagctagca agacaggttt ccagactgga aagtgggcag 1080

<210> 2
<211> 1119
<212> DNA
<213> synthetic construct

<400> 2
atgaaaccag taacgttata cgatgtcgca gagtatgccg gtgtctctta tcagactgtt 60
tccagagtgg tgaaccaggc cagccatgtt tctgccaaaa ccagggaaaa agtggaagca 120
gccatggcag agctgaatta cattcccaac agagtggcac aacaactggc aggcaaacag 180
agcttgctga ttggagttgc cacctccagt ctggccctgc atgcaccatc tcaaattgtg 240
gcagccatta aatctagagc tgatcaactg ggagcctctg tggtggtgtc aatggtagaa 300
agaagtggag ttgaagcctg taaagctgca gtgcacaatc ttctggcaca aagagtcagt 360
gggctgatca ttaactatcc actggatgac caggatgcca ttgctgtgga agctgcctgc 420
actaatgttc cagcactctt tcttgatgtc tctgaccaga cacccatcaa cagtattatt 480
ttctcccatg aagatggtac aagactgggt gtggagcatc tggttgcatt gggacaccag 540
caaattgcac tgcttgcggg cccactcagt tctgtctcag caaggctgag actggccggc 600
tggcataaat atctcactag gaatcaaatt cagccaatag ctgaaagaga aggtgactgg 660
agtgccatgt ctgggtttca acaaaccatg caaatgctga atgagggcat tgttcccact 720
gcaatgctgg ttgccaatga tcagatggca ctgggtgcaa tgagagccat tactgagtct 780
gggctgagag ttggtgcaga tatctcggta gtgggatacg acgataccga agacagctca 840
tgttatatcc cgccgtcaac caccatcaaa caggattttc gcctgctggg gcaaaccagc 900
gtggaccgct tgctgcaact ctctcagggc caggcggtga agggcaatca gctgttgcca 960
gtctcactgg tgaagagaaa aaccaccctg gcacccaata cacaaactgc ctctccccgg 1020
gcattggctg attcactcat gcagctagca agacaggttt ccagactgga aagtgggcag 1080
agcagcctga ggcctcccaa gaagaagcga aaggtgtga 1119

<210> 3
<211> 372
<212> PRT
<213> Escherichia coli

<400> 3
Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser
1               5                   10                  15

Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
            20                  25                  30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
        35                  40                  45

Pro Asn Arg Val Ala Gln Gln Leu Ala Gly Lys Gln Ser Leu Leu Ile
    50                  55                  60

Gly Val Ala Thr Ser Ser Leu Ala Leu His Ala Pro Ser Gln Ile Val
65                  70                  75                  80

Ala Ala Ile Lys Ser Arg Ala Asp Gln Leu Gly Ala Ser Val Val Val
                85                  90                  95

Ser Met Val Glu Arg Ser Gly Val Glu Ala Cys Lys Ala Ala Val His
            100                 105                 110

Asn Leu Leu Ala Gln Arg Val Ser Gly Leu Ile Ile Asn Tyr Pro Leu
        115                 120                 125

Asp Asp Gln Asp Ala Ile Ala Val Glu Ala Ala Cys Thr Asn Val Pro
    130                 135                 140

Ala Leu Phe Leu Asp Val Ser Asp Gln Thr Pro Ile Asn Ser Ile Ile
145                 150                 155                 160

Phe Ser His Glu Asp Gly Thr Arg Leu Gly Val Glu His Leu Val Ala
                165                 170                 175

Leu Gly His Gln Gln Ile Ala Leu Leu Ala Gly Pro Leu Ser Ser Val
            180                 185                 190

Ser Ala Arg Leu Arg Leu Ala Gly Trp His Lys Tyr Leu Thr Arg Asn
        195                 200                 205

Gln Ile Gln Pro Ile Ala Glu Arg Glu Gly Asp Trp Ser Ala Met Ser
    210                 215                 220

Gly Phe Gln Gln Thr Met Gln Met Leu Asn Glu Gly Ile Val Pro Thr
225                 230                 235                 240

Ala Met Leu Val Ala Asn Asp Gln Met Ala Leu Gly Ala Met Arg Ala
                245                 250                 255

Ile Thr Glu Ser Gly Leu Arg Val Gly Ala Asp Ile Ser Val Val Gly
            260                 265                 270

Tyr Asp Asp Thr Glu Asp Ser Ser Cys Tyr Ile Pro Pro Ser Thr Thr
        275                 280                 285

Ile Lys Gln Asp Phe Arg Leu Leu Gly Gln Thr Ser Val Asp Arg Leu
    290                 295                 300

Leu Gln Leu Ser Gln Gly Gln Ala Val Lys Gly Asn Gln Leu Leu Pro
305                 310                 315                 320

Val Ser Leu Val Lys Arg Lys Thr Thr Leu Ala Pro Asn Thr Gln Thr
                325                 330                 335

Ala Ser Pro Arg Ala Leu Ala Asp Ser Leu Met Gln Leu Ala Arg Gln
            340                 345                 350

Val Ser Arg Leu Glu Ser Gly Gln Ser Ser Leu Arg Pro Pro Lys Lys
        355                 360                 365

Lys Arg Lys Val
    370



<210> 4
<211> 2163
<212> DNA
<213> synthetic construct

<400> 4
catggaccct catgataatt ttgtttcttt cactttctac tctgttgaca accattgtct 60
cctcttattt tcttttcatt ttctgtaact ttttcgttaa actttagctt gcatttgtaa 120
cgaattttta aattcacttt tgtttatttg tcagattgta agtactttct ctaatcactt 180
ttttttcaag gcaatcaggg tatattatat tgtacttcag cacagtttta gagaacaatt 240
gttataatta aatgataagg tagaatattt ctgcatataa attctggctg gcgtggaaat 300
attcttattg gtagaaacaa ctacatcctg gtcatcatcc tgcctttctc tttatggtta 360
caatgatata cactgtttga gatgaggata aaatactctg agtccaaacc gggcccctct 420
gctaaccatg ttcatgcctt cttctttttc ctacagctcc tgggcaacgt gctggttgtt 480
gtgctgtctc atcattttgg caaagatgaa accagtaacg ttatacgatg tcgcagagta 540
tgccggtgtc tcttatcaga ctgtttccag agtggtgaac caggccagcc atgtttctgc 600
caaaaccagg gaaaaagtgg aagcagccat ggcagagctg aattacattc ccaacagagt 660
ggcacaacaa ctggcaggca aacagagctt gctgattgga gttgccacct ccagtctggc 720
cctgcatgca ccatctcaaa ttgtggcagc cattaaatct agagctgatc aactgggagc 780
ctctgtggtg gtgtcaatgg tagaaagaag tggagttgaa gcctgtaaag ctgcagtgca 840
caatcttctg gcacaaagag tcagtgggct gatcattaac tatccactgg atgaccagga 900
tgccattgct gtggaagctg cctgcactaa tgttccagca ctctttcttg atgtctctga 960
ccagacaccc atcaacagta ttattttctc ccatgaagat ggtacaagac tgggtgtgga 1020
gcatctggtt gcattgggac accagcaaat tgcactgctt gcgggcccac tcagttctgt 1080
ctcagcaagg ctgagactgg ccggctggca taaatatctc actaggaatc aaattcagcc 1140
aatagctgaa agagaaggtg actggagtgc catgtctggg tttcaacaaa ccatgcaaat 1200
gctgaatgag ggcattgttc ccactgcaat gctggttgcc aatgatcaga tggcactggg 1260
tgcaatgaga gccattactg agtctgggct gagagttggt gcagatatct cggtagtggg 1320
atacgacgat accgaagaca gctcatgtta tatcccgccg tcaaccacca tcaaacagga 1380
ttttcgcctg ctggggcaaa ccagcgtgga ccgcttgctg caactctctc agggccaggc 1440
ggtgaagggc aatcagctgt tgccagtctc actggtgaag agaaaaacca ccctggcacc 1500
caatacacaa actgcctctc cccgggcatt ggctgattca ctcatgcagc tagcaagaca 1560
ggtttccaga ctggaaagtg ggcagagcag cctgaggcct cccaagaaga agcgaaaggt 1620
gtgaaattca ctcctcaggt gcaggctgcc tatcagaagg tggtggctgg tgtggccaat 1680
gccctggctc acaaatacca ctgagatctt tttccctctg ccaaaaatta tggggacatc 1740
atgaagcccc ttgagcatct gacttctggc taataaagga aatttatttt cattgcaata 1800
gtgtgttgga attttttgtg tctctcactc ggaaggacat atgggagggc aaatcattta 1860
aaacatcaga atgagtattt ggtttagagt ttggcaacat atgccatatg ctggctgcca 1920
tgaacaaagg tggctataaa gaggtcatca gtatatgaaa cagccccctg ctgtccattc 1980
cttattccat agaaaagcct tgacttgagg ttagattttt tttatatttt gttttgtgtt 2040
atttttttct ttaacatccc taaaattttc cttacatgtt ttactagcca gatttttcct 2100
cctctcctga ctactcccag tcatagctgt ccctcttctc ttatgaagat cttattaaag 2160
cag 2163

<210> 5
<211> 30
<212> DNA
<213> synthetic construct

<400> 5
tgtggaattg tgagcgctca caattccaca 30

<210> 6
<211> 18
<212> DNA
<213> synthetic construct

<400> 6
attgtgagcg ctcacaat 18

<210> 7
<211> 7
<212> PRT
<213> Simian virus 40

<400> 7
Pro Lys Lys Lys Arg Lys Val
1               5

<210> 8
<211> 5
<212> PRT
<213> Adenovirus type 37

<400> 8
Lys Arg Pro Arg Pro
1               5

<210> 9
<211> 27
<212> DNA
<213> Escherichia coli

<400> 9
atgaaaccag taacgttata cgatgtc 27

<210> 10
<211> 39
<212> DNA
<213> synthetic construct

<220>
<221> transit_peptide
<222> (1)..(39)

<400> 10
agcagcctga ggcctcccaa gaagaagcga aaggtgtga 39



<210> 11
<211> 2294
<212> DNA
<213> synthetic construct

<400> 11
catggaccct catgataatt ttgtttcttt cactttctac tctgttgaca accattgtct 60
cctcttattt tcttttcatt ttctgtaact ttttcgttaa actttagctt gcatttgtaa 120
cgaattttta aattcacttt tgtttatttg tcagattgta agtactttct ctaatcactt 180
ttttttcaag gcaatcaggg tatattatat tgtacttcag cacagtttta gagaacaatt 240
gttataatta aatgataagg tagaatattt ctgcatataa attctggctg gcgtggaaat 300
attcttattg gtagaaacaa ctacatcctg gtcatcatcc tgcctttctc tttatggtta 360
caatgatata cactgtttga gatgaggata aaatactctg agtccaaacc gggcccctct 420
gctaaccatg ttcatgcctt cttctttttc ctacagctcc tgggcaacgt gctggttgtt 480
gtgctgtctc atcattttgg caaagatgaa accagtaacg ttatacgatg tcgcagagta 540
tgccggtgtc tcttatcaga ctgtttccag agtggtgaac caggccagcc atgtttctgc 600
caaaaccagg gaaaaagtgg aagcagccat ggcagagctg aattacattc ccaacagagt 660
ggcacaacaa ctggcaggca aacagagctt gctgattgga gttgccacct ccagtctggc 720
cctgcatgca ccatctcaaa ttgtggcagc cattaaatct agagctgatc aactgggagc 780
ctctgtggtg gtgtcaatgg tagaaagaag tggagttgaa gcctgtaaag ctgcagtgca 840
caatcttctg gcacaaagag tcagtgggct gatcattaac tatccactgg atgaccagga 900
tgccattgct gtggaagctg cctgcactaa tgttccagca ctctttcttg atgtctctga 960
ccagacaccc atcaacagta ttattttctc ccatgaagat ggtacaagac tgggtgtgga 1020
gcatctggtt gcattgggac accagcaaat tgcactgctt gcgggcccac tcagttctgt 1080
ctcagcaagg ctgagactgg ccggctggca taaatatctc actaggaatc aaattcagcc 1140
aatagctgaa agagaaggtg actggagtgc catgtctggg tttcaacaaa ccatgcaaat 1200
gctgaatgag ggcattgttc ccactgcaat gctggttgcc aatgatcaga tggcactggg 1260
tgcaatgaga gccattactg agtctgggct gagagttggt gcagatatct cggtagtggg 1320
atacgacgat accgaagaca gctcatgtta tatcccgccg tcaaccacca tcaaacagga 1380
ttttcgcctg ctggggcaaa ccagcgtgga ccgcttgctg caactctctc agggccaggc 1440
ggtgaagggc aatcagctgt tgccagtctc actggtgaag agaaaaacca ccctggcacc 1500
caatacacaa actgcctctc cccgggcatt ggctgattca ctcatgcagc tagcaagaca 1560
ggtttccaga ctggaaagtg ggcagagcag cctgaggcct cccaagaaga agcgaaaggt 1620
gtgaaattca ctcctcaggt gcaggctgcc tatcagaagg tggtggctgg tgtggccaat 1680
gccctggctc acaaatacca ctgagatctt tttccctctg ccaaaaatta tggggacatc 1740
atgaagcccc ttgagcatct gacttctggc taataaagga aatttatttt cattgcaata 1800
gtgtgttgga attttttgtg tctctcactc ggaaggacat atgggagggc aaatcattta 1860
aaacatcaga atgagtattt ggtttagagt ttggcaacat atgccatatg ctggctgcca 1920
tgaacaaagg tggctataaa gaggtcatca gtatatgaaa cagccccctg ctgtccattc 1980
cttattccat agaaaagcct tgacttgagg ttagattttt tttatatttt gttttgtgtt 2040
atttttttct ttaacatccc taaaattttc cttacatgtt ttactagcca gatttttcct 2100
cctctcctga ctactcccag tcatagctgt ccctcttctc ttatgaagat cttattaaag 2160
cagtaacttg tttattgcag cttataatgg ttacaaataa agcaatagca tcacaaattt 2220
cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac tcatcaatgt 2280
atcttatcat gtct 2294

<210> 12
<211> 277
<212> DNA
<213> synthetic construct

<400> 12
ggatcctcta cgccggacgc atcgtggccg gcatcaccgg cgccacaggt gcggttgctg 60
gcgcctatat cgccgacatc accgatgggg aagatcgggc tcgccacttc gggctcatga 120
gcgcttgttt cggcgtgggt atggtggcag gccccgtggc cgggggactg ttgggcgcca 180
tctccttgca tgcaccattc cttgcggcgg cggtgctcaa cggcctcaac ctactactgg 240
gctgcttcct aatgcaggag tcgcataagg gagagcg 277

<210> 13
<211> 44
<212> DNA
<213> synthetic construct

<400> 13
gatcagtcga cctgcagccc aagcttgata tcgaattcgg atct 44

<210> 14
<211> 30
<212> DNA
<213> synthetic construct

<400> 14
gatcagtcga cctgcagccc aagcttcacc 30

<210> 15
<211> 25
<212> DNA
<213> synthetic construct

<400> 15
taggatcccc gggctgcagg aattc 25

<210> 16
<211> 15
<212> DNA
<213> synthetic construct

<400> 16
agcagcctga ggcct 15

<210> 17
<211> 17
<212> DNA
<213> Artificial

<220>
<223> PCR primer for the beta-actin promoter

<220>
<221> primer_bind
<222> (1)..(17)

<400> 17
acagagcctc gcctttg 17

<210> 18
<211> 17
<212> DNA
<213> Artificial

<220>
<223> PCR primer for the lacI coding sequence

<220>
<221> primer_bind
<222> (1)..(17)




<400> 18
tgcaggcagc ttccaca 17

<210> 19
<211> 29
<212> DNA
<213> synthetic construct

<400> 19
gtggaattgt gagcggataa caatttcac 29

<210> 20
<211> 47
<212> DNA
<213> synthetic construct

<400> 20
agatctgtgg aattgtgagc ggataacaat ttcacggatc cagatct 47
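
Two properties of the listed sequences can be checked directly from the listing:
whether an operator sequence reads the same as its own reverse complement, and
what peptide the short coding segment of SEQ ID NO: 10 specifies. The sketch
below is illustrative only. The helper names are hypothetical, and translation
assumes the standard genetic code (bases ordered t, c, a, g), which is an
assumption of this example and not a statement from the listing.

```python
# Reverse-complement and translation helpers for sequences copied from the listing.
COMPLEMENT = str.maketrans("acgt", "tgca")

# Assumption: standard genetic code, codons enumerated with bases in the order t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Lower-case a sequence and drop the spacing used in the listing."""
    return seq.lower().replace(" ", "")


def reverse_complement(seq: str) -> str:
    return clean(seq).translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return clean(seq) == reverse_complement(seq)


def translate(seq: str) -> str:
    """Translate complete codons in frame 1; '*' marks a stop codon."""
    s = clean(seq)
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, len(s) - len(s) % 3, 3))


SEQ_ID_5 = "tgtggaattg tgagcgctca caattccaca"
SEQ_ID_6 = "attgtgagcg ctcacaat"
SEQ_ID_10 = "agcagcctga ggcctcccaa gaagaagcga aaggtgtga"
SEQ_ID_19 = "gtggaattgt gagcggataa caatttcac"

if __name__ == "__main__":
    for name, seq in [("5", SEQ_ID_5), ("6", SEQ_ID_6), ("19", SEQ_ID_19)]:
        print(f"SEQ ID NO: {name} palindromic: {is_palindromic(seq)}")
    print("SEQ ID NO: 10 encodes:", translate(SEQ_ID_10))
```

Run as written, SEQ ID NO: 5 and SEQ ID NO: 6 are reported as palindromic and
SEQ ID NO: 19 is not, and the translation of SEQ ID NO: 10 ends with the seven
residues of SEQ ID NO: 7 followed by a stop codon.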






